1 Introduction

Random matrix theory has proved to be a powerful tool in a wide range of fields
including statistics, high-energy physics, electrical engineering and number theory.
Traditionally the primary focus is on the spectral analysis of eigenvalues and eigenvectors. See,
for example, Johnstone (2001 and 2008), Bai, Miao and Pan (2007), and Jiang (2004b).
For general background on random matrix theory, see, for example, Bai and Silverstein
(2009) and Anderson, Guionnet, and Zeitouni (2009).

In statistics, random matrix theory is particularly useful for inference on
high-dimensional data, which are becoming increasingly available in many areas of scientific
investigation. In these applications, the dimension $p$ can be much larger than the sample size
$n$. In such a setting, classical statistical methods and results based on fixed $p$ and large $n$
are no longer applicable. Examples include high-dimensional regression, hypothesis testing
concerning high-dimensional parameters, and inference on large covariance matrices. See,
for example, Candes and Tao (2007), Cai, Wang and Xu (2010a), Bai and Saranadasa
(1996), Bai, Jiang, Yao and Zheng (2009), and Cai, Zhang and Zhou (2010).

In the present paper we study the limiting laws of the coherence of an $n \times p$ random
matrix, which is defined to be the largest magnitude of the off-diagonal entries of the sample
correlation matrix generated from the $n \times p$ random matrix. We are especially interested
in the case where $p \gg n$. This is a problem of independent interest. Moreover, we are
particularly interested in the applications of the results to testing the covariance structure of
a high-dimensional Gaussian variable and the construction of compressed sensing matrices.
These three problems are important in their respective fields, one in random matrix theory,
one in statistics and one in signal processing. The latter two problems are seemingly
unrelated at first sight, but as we shall see later they can both be attacked through the use
of the limiting laws of the coherence of random matrices.

1.1 Limiting Laws of the Coherence of a Random Matrix 

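Let $X_n = (x_{ij})$ be an $n \times p$ random matrix where the entries $x_{ij}$ are i.i.d. real random
variables with mean $\mu$ and variance $\sigma^2 > 0$. Let $x_1, x_2, \cdots, x_p$ be the $p$ columns of $X_n$.
The sample correlation matrix $\Gamma_n$ is defined by $\Gamma_n := (\rho_{ij})$ with

$$\rho_{ij} = \frac{(x_i - \bar{x}_i)^T (x_j - \bar{x}_j)}{\|x_i - \bar{x}_i\| \cdot \|x_j - \bar{x}_j\|}, \quad 1 \le i, j \le p, \tag{1}$$

where $\bar{x}_k = (1/n) \sum_{i=1}^n x_{ik}$ and $\|\cdot\|$ is the usual Euclidean norm in $\mathbb{R}^n$. Here we write $x_i - \bar{x}_i$
for $x_i - \bar{x}_i e$, where $e = (1, 1, \cdots, 1)^T \in \mathbb{R}^n$. In certain applications such as construction
of compressed sensing matrices, the mean $\mu$ of the random entries $x_{ij}$ is known (typically
$\mu = 0$) and the sample correlation matrix is then defined to be $\tilde{\Gamma}_n := (\tilde{\rho}_{ij})$ with

$$\tilde{\rho}_{ij} = \frac{(x_i - \mu)^T (x_j - \mu)}{\|x_i - \mu\| \cdot \|x_j - \mu\|}, \quad 1 \le i, j \le p. \tag{2}$$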

One of the main objects of interest in the present paper is the largest magnitude of the 
off-diagonal entries of the sample correlation matrix, 

$$L_n = \max_{1 \le i < j \le p} |\rho_{ij}| \quad \text{and} \quad \tilde{L}_n = \max_{1 \le i < j \le p} |\tilde{\rho}_{ij}|. \tag{3}$$

In the compressed sensing literature, the quantity $\tilde{L}_n$ is called the coherence of the
matrix $X_n$. A matrix is incoherent when $\tilde{L}_n$ is small. See, for example, Donoho, Elad and
Temlyakov (2006). With slight abuse of terminology, in this paper we shall call both $L_n$
and $\tilde{L}_n$ the coherence of the random matrix $X_n$, the former for the case where $\mu$ is unknown and the
latter for the case where $\mu$ is known. The first goal of the present paper is to derive the limiting
laws of the coherence in the high dimensional setting.
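
The two versions of the coherence in (3) can be computed directly from their definitions. The following is a minimal sketch in plain Python, assuming the data matrix is given as a list of rows and that no column is constant; the function name `coherence` and its `mean` argument are illustrative choices, not notation from the paper.

```python
import math

def coherence(X, mean=None):
    """Largest absolute off-diagonal sample correlation of the columns of X.

    X is a list of n rows, each a list of p numbers.
    mean=None: centre each column at its own sample average (unknown-mean version).
    mean=mu:   centre every entry at the known mean mu (known-mean version).
    Assumes every centred column has nonzero norm.
    """
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [X[i][j] for i in range(n)]
        centre = sum(col) / n if mean is None else mean
        cols.append([v - centre for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    best = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))
            best = max(best, abs(inner) / (norms[i] * norms[j]))
    return best
```

For example, `coherence(X)` corresponds to the unknown-mean coherence and `coherence(X, mean=0.0)` to the known-mean coherence with mean zero, the usual convention for compressed sensing matrices.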

In the case where $p$ and $n$ are comparable, i.e., $n/p \to \gamma \in (0, \infty)$, asymptotic properties
of the coherence $L_n$ of the random matrix $X_n$ have been considered by Jiang (2004a), Zhou
(2007), Liu, Lin and Shao (2008), and Li, Liu and Rosalsky (2009). In this paper we focus
on the high dimensional case where $p$ can be as large as $e^{n^\beta}$ for some $0 < \beta < 1$. This is a
case of special interest for the applications considered later.

The results given in Section 2 show that under regularity conditions,

$$\sqrt{n/\log p}\; L_n \xrightarrow{P} 2 \quad \text{as } n \to \infty,$$

where $\xrightarrow{P}$ denotes convergence in probability. Here and throughout the paper $\log$ is the
natural logarithm $\log_e$. Furthermore, it is shown that $nL_n^2 - 4\log p + \log\log p$ converges
weakly to an extreme distribution of type I with distribution function

$$F(y) = e^{-\frac{1}{\sqrt{8\pi}} e^{-y/2}}, \quad y \in \mathbb{R}.$$

The same results hold for $\tilde{L}_n$. In contrast to the known results in the literature, here the
dimension $p$ can be much larger than $n$. In the special cases where the $x_{ij}$ are either bounded
or normally distributed, the results hold as long as $\log p = o(n^{1/3})$.
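
As an illustration of the limiting law, the sketch below evaluates the type I extreme distribution function $F(y) = \exp(-(8\pi)^{-1/2} e^{-y/2})$, its quantiles, and the normalized statistic $nL_n^2 - 4\log p + \log\log p$. The function names are hypothetical and chosen only for this example; the quantile formula is obtained by solving $F(y) = q$ for $y$.

```python
import math

def extreme_cdf(y):
    """F(y) = exp(-(1/sqrt(8*pi)) * exp(-y/2))."""
    return math.exp(-math.exp(-y / 2.0) / math.sqrt(8.0 * math.pi))

def extreme_quantile(q):
    """Solve F(y) = q for 0 < q < 1: y = -log(8*pi) - 2*log(log(1/q))."""
    return -math.log(8.0 * math.pi) - 2.0 * math.log(-math.log(q))

def normalized_statistic(L, n, p):
    """n*L^2 - 4*log(p) + log(log(p)) for a coherence value L (requires p > 1)."""
    return n * L * L - 4.0 * math.log(p) + math.log(math.log(p))
```

In a simulation one would compare the empirical distribution of `normalized_statistic(L, n, p)` over repeated draws with `extreme_cdf`; the quality of the approximation for any particular finite $n$ and $p$ is not something this sketch asserts.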

In addition, motivated by application to testing covariance structure, we also consider
the case where the entries of the random matrix $X_n$ are correlated. More specifically, let
$X_n = (x_{ij})_{1 \le i \le n, 1 \le j \le p}$, where the $n$ rows are i.i.d. random vectors with distribution $N_p(\mu, \Sigma)$.
For a given integer $\tau \ge 1$ (which can depend on $n$ or $p$), it is of interest in applications to
test the hypothesis that the covariance matrix $\Sigma$ is banded, that is,

$$H_0 : \sigma_{ij} = 0 \text{ for all } |i - j| \ge \tau. \tag{4}$$

Analogous to the definition of $L_n$ and $\tilde{L}_n$, we define

$$L_{n,\tau} = \max_{|i-j| \ge \tau} |\rho_{ij}| \tag{5}$$

when the mean $\mu$ is assumed to be unknown and define

$$\tilde{L}_{n,\tau} = \max_{|i-j| \ge \tau} |\tilde{\rho}_{ij}| \tag{6}$$

when the mean $\mu = (\mu_1, \mu_2, \cdots, \mu_p)$ is assumed to be known. In the latter case $\tilde{\rho}_{ij}$ is defined
to be

$$\tilde{\rho}_{ij} = \frac{(x_i - \mu_i)^T (x_j - \mu_j)}{\|x_i - \mu_i\| \cdot \|x_j - \mu_j\|}, \quad 1 \le i, j \le p. \tag{7}$$

We shall derive in Section 2 the limiting distribution of $L_{n,\tau}$ and $\tilde{L}_{n,\tau}$ under the null
hypothesis $H_0$ and discuss its application in Section 3. The study for this case is considerably
more difficult technically than that for the i.i.d. case.



1.2 Testing Covariance Structure 

Covariance matrices play a critical role in many areas of statistical inference. Important 
examples include principal component analysis, regression analysis, linear and quadratic 
discriminant analysis, and graphical models. In the classical setting of low dimension and 
large sample size, many methods have been developed for estimating covariance matrices 
as well as testing specific patterns of covariance matrices. In particular testing for inde- 
pendence in the Gaussian case is of special interest because many statistical procedures are 
built upon the assumptions of independence and normality of the observations. 

To be more specific, suppose we observe independent and identically distributed
$p$-variate random variables $Y_1, \ldots, Y_n$ with mean $\mu = \mu_{p \times 1}$, covariance matrix $\Sigma = \Sigma_{p \times p}$
and correlation matrix $R = R_{p \times p}$. In the setting where the dimension $p$ and the sample
size $n$ are comparable, i.e., $n/p \to \gamma \in (0, \infty)$, testing of the hypotheses $H_0 : \Sigma = I$ versus
$H_a : \Sigma \ne I$, assuming $\mu = 0$, has been considered by Johnstone (2001) in the Gaussian
case and by Péché (2009) in the more general case where the distribution is assumed to
be sub-Gaussian and where the ratio $p/n$ can converge to either a positive number $\gamma$ or
$\infty$. The test statistic is based on the largest eigenvalue of the sample covariance matrix
and relies on the important results in their papers that the largest eigenvalue of the sample
covariance matrix follows the Tracy-Widom distribution asymptotically.

The hypothesis $H_0 : \Sigma = I$ is too restrictive for many applications. An arguably more
practically important problem is testing for independence in the Gaussian case. That is,
one wishes to test the hypothesis $H_0 : \Sigma$ is diagonal against the hypothesis $H_a : \Sigma$ is not
diagonal, or equivalently in terms of the correlation matrix $R$, one wishes to test $H_0 : R = I$
versus $H_a : R \ne I$. Tests based on the largest eigenvalue of the sample covariance matrix
cannot be easily modified for testing these hypotheses.

In this paper, we consider testing more general hypotheses on the covariance structure
of a high dimensional Gaussian distribution, which include testing for independence as
a special case. More specifically, we consider testing the hypothesis that $\Sigma$ is banded
with a given bandwidth $\tau$ (which may depend on $n$ or $p$), i.e., the variables have nonzero
correlations only up to lag $\tau$. In other words, for a given integer $\tau \ge 1$, we wish to test the
hypothesis $H_0 : \sigma_{ij} = 0$ for all $|i - j| \ge \tau$. This problem arises, for example, in econometrics
when testing certain economic theories and in time series analysis. See Andrews (1991),
Ligeralde and Brown (1995) and references therein. The special case of $\tau = 1$ corresponds
to testing for independence. We shall show that the limiting laws of $L_{n,\tau}$ developed in the
present paper can be readily applied to construct a convenient test for the bandedness of
the covariance matrix. In the special case of $\tau = 1$, the limiting laws of the coherence of
the data matrix $Y$ play a critical role in the construction of the test.

1.3 Construction of Compressed Sensing Matrices 

In addition to testing the covariance structure, another important application of our results 
on the limiting laws of the coherence of a random matrix is to the construction of com- 
pressed sensing matrices. Compressed sensing is a fast developing field which provides a 
novel and efficient data acquisition technique that enables accurate reconstruction of highly 
undersampled sparse signals. See, for example, Donoho (2006a). It has a wide range of 
applications including signal processing, medical imaging, and seismology. In addition, 
the development of the compressed sensing theory also provides crucial insights into high 
dimensional regression in statistics. See, e.g., Candes and Tao (2007), Bickel, Ritov and 
Tsybakov (2009), and Candes and Plan (2009). 

One of the main goals of compressed sensing is to construct measurement matrices
$X_{n \times p}$, with the number of measurements $n$ as small as possible relative to $p$, such that for
any $k$-sparse signal $\beta \in \mathbb{R}^p$, one can recover $\beta$ exactly from linear measurements $y = X\beta$
using a computationally efficient recovery algorithm. In compressed sensing it is typical
that $p \gg n$; for example, $p$ can be of order $e^{n^\beta}$ for some $0 < \beta < 1$. In fact, the goal is often
to make $p$ as large as possible relative to $n$. It is now well understood that the method of $\ell_1$
minimization provides an effective way for reconstructing a sparse signal in many settings.
In order for a recovery algorithm such as $\ell_1$ minimization to work well, the measurement
matrices $X_{n \times p}$ must satisfy certain conditions. Two commonly used conditions are the
so-called restricted isometry property (RIP) and mutual incoherence property (MIP). Roughly
speaking, the RIP requires subsets of certain cardinality of the columns of $X$ to be close to
an orthonormal system and the MIP requires the pairwise correlations among the column
vectors of $X$ to be small. See Candes and Tao (2005), Donoho, Elad and Temlyakov (2006)
and Cai, Wang and Xu (2010a, b). It is well known that construction of large deterministic
measurement matrices that satisfy either the RIP or MIP is difficult. Instead, random
matrices are commonly used. Matrices generated by certain random processes have been
shown to satisfy the RIP conditions with high probability. See, e.g., Baraniuk et al.
(2008). A major technical tool used there is the Johnson-Lindenstrauss lemma. Here we
focus on the MIP.

The MIP condition can be easily explained. It was first shown by Donoho and Huo
(2001), in the setting where $X$ is a concatenation of two square orthogonal matrices, that
the condition

$$(2k - 1)\tilde{L}_n < 1 \tag{8}$$

ensures the exact recovery of $\beta$ when $\beta$ has at most $k$ nonzero entries (such a signal is called
$k$-sparse). This result was then extended by Fuchs (2004) to general matrices. Cai, Wang
and Xu (2010b) showed that condition (8) is also sufficient for stable recovery of sparse
signals in the noisy case where $y$ is measured with error. In addition, it was shown that this
condition is sharp in the sense that there exist matrices $X$ such that it is not possible to
recover certain $k$-sparse signals $\beta$ based on $y = X\beta$ when $(2k - 1)\tilde{L}_n = 1$.

The mutual incoherence property (8) is very desirable. When it is satisfied by the
measurement matrix $X$, the estimator obtained through $\ell_1$ minimization satisfies
near-optimality properties and oracle inequalities. In addition, the technical analysis is
particularly simple. See, for example, Cai, Wang and Xu (2010b). Except for results on the magnitude
and the limiting distribution of $\tilde{L}_n$ when the underlying matrix is Haar-invariant and
orthogonal by Jiang (2005), it is, however, unknown in general how likely a random matrix
is to satisfy the MIP (8) in the high dimensional setting where $p$ can be as large as $e^{n^\beta}$. We
shall show in Section 4 that the limiting laws of the coherence of random matrices given
in this paper can readily be applied to compute the probability that random measurement
matrices satisfy the MIP condition (8).

1.4 Organization of the Paper

The rest of the paper is organized as follows. We begin in Section 2 by studying the
limiting laws of the coherence of a random matrix in the high-dimensional setting. Section
3 considers the problem of testing for independence and bandedness in the Gaussian case.
The test statistic is based on the coherence of the data matrix and the construction of the
tests relies heavily on the asymptotic results developed in Section 2. Application to the
construction of compressed sensing matrices is considered in Section 4. Section 5 discusses
connections and differences of our results with other related work. The main results
are proved in Section 6 and the proofs of technical lemmas are given in the Appendix.

2 Limiting Laws of Coherence of Random Matrices 

In this section, we consider the limiting laws of the coherence of a random matrix with
i.i.d. entries. In addition, we also consider the case where each row of the random matrix
is drawn independently from a multivariate Gaussian distribution with banded covariance
matrix. In the latter case we consider the limiting distribution of $L_{n,\tau}$ and $\tilde{L}_{n,\tau}$ defined in
(5) and (6). We then apply the asymptotic results to the testing of the covariance structure
in Section 3 and the construction of compressed sensing matrices in Section 4.

2.1 The i.i.d. Case 

We begin by considering the case of independence where all entries of the random matrix
are independent and identically distributed. Suppose $\{\xi, x_{ij}, i, j = 1, 2, \cdots\}$ are i.i.d. real
random variables with mean $\mu$ and variance $\sigma^2 > 0$. Let $X_n = (x_{ij})_{1 \le i \le n, 1 \le j \le p}$ and let
$x_1, x_2, \cdots, x_p$ be the $p$ columns of $X_n$. Then $X_n = (x_1, x_2, \cdots, x_p)$. Let $\bar{x}_k = (1/n) \sum_{i=1}^n x_{ik}$
be the sample average of $x_k$. We write $x_i - \bar{x}_i$ for $x_i - \bar{x}_i e$, where $e = (1, 1, \cdots, 1)^T \in \mathbb{R}^n$.
Define the Pearson correlation coefficient $\rho_{ij}$ between $x_i$ and $x_j$ as in (1). Then the sample
correlation matrix generated by $X_n$ is $\Gamma_n := (\rho_{ij})$, which is a $p$ by $p$ symmetric matrix with
diagonal entries $\rho_{ii} = 1$ for all $1 \le i \le p$. When the mean $\mu$ of the random variables $x_{ij}$
is assumed to be known, we define the sample correlation matrix by $\tilde{\Gamma}_n := (\tilde{\rho}_{ij})$ with $\tilde{\rho}_{ij}$
given as in (2).

In this section we are interested in the limiting laws of the coherence $L_n$ and $\tilde{L}_n$ of the
random matrix, which are defined to be the largest magnitude of the off-diagonal entries
of the sample correlation matrices $\Gamma_n$ and $\tilde{\Gamma}_n$ respectively, see (3). The case of $p \gg n$ is of
particular interest to us. In such a setting, some simulation studies of the distribution
of $L_n$ were made in Cai and Lv (2007) and Fan and Lv (2008 and 2010). We now derive the
limiting laws of $L_n$ and $\tilde{L}_n$.

We shall introduce another quantity that is useful for our technical analysis. Define

$$J_n = \max_{1 \le i < j \le p} \frac{|(x_i - \mu)^T (x_j - \mu)|}{\sigma^2}. \tag{9}$$

We first state the law of large numbers for $L_n$ for the case where the random entries $x_{ij}$
are bounded.

Theorem 1 Assume $|x_{11}| \le C$ for a finite constant $C > 0$, and $p = p(n) \to \infty$ and
$\log p = o(n)$ as $n \to \infty$. Then $\sqrt{n/\log p}\, L_n \to 2$ in probability as $n \to \infty$.

We now consider the case where the $x_{ij}$ have finite exponential moments.

Theorem 2 Suppose $E e^{t_0 |x_{11}|^\alpha} < \infty$ for some $\alpha > 0$ and $t_0 > 0$. Set $\beta = \alpha/(4 + \alpha)$.
Assume $p = p(n) \to \infty$ and $\log p = o(n^\beta)$ as $n \to \infty$. Then $\sqrt{n/\log p}\, L_n \to 2$ in probability
as $n \to \infty$.

Comparing Theorems 1 and 2, it can be seen that a stronger moment condition allows a
higher order of $p$ for which the law of large numbers for $L_n$ remains valid. Also, based on Theorem
2, if $E e^{t_0 |x_{11}|^\alpha} < \infty$ for any $\alpha > 0$, then $\beta \to 1$, hence the order $o(n^\beta)$ is close to $o(n)$, which
is the order in Theorem 1.

We now consider the limiting distribution of $L_n$ after suitable normalization.

Theorem 3 Suppose $E e^{t_0 |x_{11}|^\alpha} < \infty$ for some $0 < \alpha \le 2$ and $t_0 > 0$. Set $\beta = \alpha/(4 + \alpha)$.
Assume $p = p(n) \to \infty$ and $\log p = o(n^\beta)$ as $n \to \infty$. Then $nL_n^2 - 4\log p + \log\log p$ converges
weakly to an extreme distribution of type I with distribution function

$$F(y) = e^{-\frac{1}{\sqrt{8\pi}} e^{-y/2}}, \quad y \in \mathbb{R}.$$

Remark 2.1 Propositions 6.1, 6.2 and 6.3 show that the above three theorems are still
valid if $L_n$ is replaced by either $\tilde{L}_n$ or $J_n/n$, where $\tilde{L}_n$ is as in (3) and $J_n$ is as in (9).

In the case where $n$ and $p$ are comparable, i.e., $n/p \to \gamma \in (0, \infty)$, Jiang (2004a) obtained
the strong laws and asymptotic distributions of the coherence $L_n$ of random matrices.
Several authors improved the results by sharpening the moment assumptions, see, e.g.,
Li and Rosalsky (2006), Zhou (2007), and Li, Liu and Rosalsky (2009) where the same
condition $n/p \to \gamma \in (0, \infty)$ was imposed. Liu, Lin and Shao (2008) showed that the same
results hold for $p \to \infty$ and $p = O(n^\alpha)$ where $\alpha$ is a constant.

In this paper, motivated by the applications mentioned earlier, we are particularly
interested in the case where both $n$ and $p$ are large and $p = o(e^{n^\beta})$ while the entries of $X_n$
are i.i.d. with a certain moment condition. We also consider the case where the $n$ rows
of $X_n$ form a random sample from $N_p(\mu, \Sigma)$ with $\Sigma$ being a banded matrix. In particular,
the entries of $X_n$ are not necessarily independent. As shown in the above theorems and in
Section 2.2 later, when $p \le e^{n^\beta}$ for a certain $\beta > 0$, we obtain the strong laws and limiting
distributions of the coherence of the random matrix $X_n$. Presumably the results on high order
$p = o(e^{n^\beta})$ need stronger moment conditions than those for the case $p = O(n^\alpha)$. Ignoring
the moment conditions, our results cover those in Liu, Lin and Shao (2008) as well as the others
mentioned above.

Theorem 1.2 in Jiang (2004a) states that if $n/p \to \gamma \in (0, \infty)$ and $E|\xi|^{30+\epsilon} < \infty$ for
some $\epsilon > 0$, then for any $y \in \mathbb{R}$,

$$P\left(nL_n^2 - 4\log n + \log\log n \le y\right) \to e^{-K e^{-y/2}} \tag{10}$$

where $K = (\gamma^2 \sqrt{8\pi})^{-1}$, as $n \to \infty$. It is not difficult to see that Theorem 3 implies Theorem
1.2 in Jiang (2004a) under the condition that $n/p \to \gamma$ and $E e^{t_0 |x_{11}|^\alpha} < \infty$ for some $0 < \alpha \le 2$
and $t_0 > 0$. In fact, write

$$nL_n^2 - 4\log n + \log\log n = \left(nL_n^2 - 4\log p + \log\log p\right) + 4\log\frac{p}{n} + \left(\log\log n - \log\log p\right).$$

Theorem 3 yields that $nL_n^2 - 4\log p + \log\log p$ converges weakly to $F(y) = \exp\left(-\frac{1}{\sqrt{8\pi}} e^{-y/2}\right)$.
Note that since $n/p \to \gamma$,

$$4\log\frac{p}{n} \to -4\log\gamma \quad \text{and} \quad \log\log n - \log\log p \to 0.$$

Now it follows from Slutsky's Theorem that $nL_n^2 - 4\log n + \log\log n$ converges weakly to
$F(y + 4\log\gamma)$, which is exactly (10) from Theorem 1.2 in Jiang (2004a).

2.2 The Dependent Case 

We now consider the case where the rows of the random matrix $X_n$ are drawn independently
from a multivariate Gaussian distribution. Let $X_n = (x_{ij})_{1 \le i \le n, 1 \le j \le p}$, where the $n$ rows
are i.i.d. random vectors with distribution $N_p(\mu, \Sigma)$, where $\mu \in \mathbb{R}^p$ is arbitrary in this
section unless otherwise specified. Let $(r_{ij})_{p \times p}$ be the correlation matrix obtained from
$\Sigma = (\sigma_{ij})_{p \times p}$. As mentioned in the introduction, it is of interest to test the hypothesis that
the covariance matrix $\Sigma$ is banded, that is,

$$H_0 : \sigma_{ij} = 0 \text{ for all } |i - j| \ge \tau \tag{11}$$

for a given integer $\tau \ge 1$. In order to construct a test, we study in this section the asymptotic
distributions of $L_{n,\tau}$ and $\tilde{L}_{n,\tau}$ defined in (5) and (6) respectively, assuming the covariance
matrix $\Sigma$ has the desired banded structure under the null hypothesis. This case is much harder
than the i.i.d. case considered in Section 2.1 because of the dependence.
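For any $0 < \delta < 1$, set

$$\Gamma_{p,\delta} = \{1 \le i \le p \,;\ |r_{ij}| > 1 - \delta \text{ for some } 1 \le j \le p \text{ with } j \ne i\}. \tag{12}$$

Theorem 4 Suppose, as $n \to \infty$,

(i) $p = p_n \to \infty$ with $\log p = o(n^{1/3})$;

(ii) $\tau = o(p^t)$ for any $t > 0$;

(iii) for some $\delta \in (0, 1)$, $|\Gamma_{p,\delta}| = o(p)$, which is particularly true if $\max_{1 \le i < j \le p < \infty} |r_{ij}| \le 1 - \delta$.

Then, under $H_0$, $nL_{n,\tau}^2 - 4\log p + \log\log p$ converges weakly to an extreme distribution of
type I with distribution function

$$F(y) = e^{-\frac{1}{\sqrt{8\pi}} e^{-y/2}}, \quad y \in \mathbb{R}.$$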

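Similar to $J_n$ in (9), we define

$$U_{n,\tau} = \max_{1 \le i < j \le p,\ |i-j| \ge \tau} \frac{|(x_i - \mu_i)^T (x_j - \mu_j)|}{\sigma_i \sigma_j} \tag{13}$$

where we write $x_i - \mu_i$ for $x_i - \mu_i e$ with $e = (1, 1, \cdots, 1)^T \in \mathbb{R}^n$, $\mu = (\mu_1, \cdots, \mu_p)^T$ and the
$\sigma_i^2$'s are the diagonal entries of $\Sigma$.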

Remark 2.2 From Proposition 6.4 we know Theorem 4 still holds if $L_{n,\tau}$ is replaced
with $U_{n,\tau}$ defined in (13). In fact, by the first paragraph in the proof of Theorem 4, to see
whether the theorem holds for $U_{n,\tau}$, we only need to consider the problem by assuming, w.l.o.g.,
$\mu = 0$ and that the $\sigma_i$'s, the diagonal entries of $\Sigma$, are all equal to 1. Thus, by Proposition 6.4,
Theorem 4 holds when $L_{n,\tau}$ is replaced by $U_{n,\tau}$.

Theorem 4 implies immediately the following result.

Corollary 2.1 Suppose the conditions in Theorem 4 hold. Then $\sqrt{n/\log p}\; L_{n,\tau} \to 2$ in
probability as $n \to \infty$.

The assumptions (ii) and (iii) in Theorem 4 are both essential. If one of them is violated,
the conclusion may fail. The following two examples illustrate this point.

Remark 2.3 Consider $\Sigma = I_p$ with $p = 2n$ and $\tau = n$. So conditions (i) and (iii) in
Theorem 4 hold, but (ii) does not. Observe

$$\left|\{(i, j);\ 1 \le i < j \le 2n,\ |i - j| \ge n\}\right| = n + (n - 1) + \cdots + 1 = \frac{n(n + 1)}{2} \sim \frac{p^2}{8}$$

as $n \to \infty$. So $L_{n,\tau}$ is the maximum of roughly $p^2/8$ random variables, and the dependence
between any two of these random variables is weaker than that appearing in $L_n$ in Theorem 3. The
result in Theorem 3 can be rewritten as

$$nL_n^2 - 2\log\frac{p^2}{2} + \log\log\frac{p^2}{2} - \log 8 \text{ converges weakly to } F$$

as $n \to \infty$. Recalling that $L_n$ is the maximum of roughly $p^2/2$ weakly dependent random
variables, replace $L_n$ with $L_{n,\tau}$ and $p^2/2$ with $p^2/8$ to see that
$nL_{n,\tau}^2 - 2\log\frac{p^2}{8} + \log\log\frac{p^2}{8} - \log 8$
converges weakly to $F$, where $F$ is as in Theorem 3. That is,

$$\left(nL_{n,\tau}^2 - 4\log p + \log\log p\right) + \log 16 \text{ converges weakly to } F \tag{14}$$

as $n \to \infty$ (this can be done rigorously by following the proof of Theorem 3). The difference
between (14) and Theorem 4 is evident.
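
The pair count used in Remark 2.3 is easy to check by brute force. The following short sketch (the function name `count_pairs` is an illustrative choice) counts the index pairs over which the maximum in $L_{n,\tau}$ is taken, so that the identity $n + (n-1) + \cdots + 1 = n(n+1)/2$ for $p = 2n$, $\tau = n$ can be verified for small $n$.

```python
def count_pairs(p, tau):
    """Number of pairs (i, j) with 1 <= i < j <= p and j - i >= tau."""
    return sum(1 for i in range(1, p + 1)
                 for j in range(i + 1, p + 1)
                 if j - i >= tau)

# With p = 2n and tau = n the count is n(n+1)/2, which is of order p^2/8,
# whereas tau = 1 gives all p(p-1)/2 off-diagonal pairs, of order p^2/2.
for n in (1, 5, 20):
    assert count_pairs(2 * n, n) == n * (n + 1) // 2
```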

Remark 2.4 Let $p = mn$ with integer $m \ge 2$. We consider the $p \times p$ matrix
$\Sigma = \mathrm{diag}(H_n, \cdots, H_n)$ where there are $m$ copies of $H_n$ on the diagonal of $\Sigma$ and all of the entries of
the $n \times n$ matrix $H_n$ are equal to 1. Thus, if $(\zeta_1, \cdots, \zeta_p) \sim N_p(0, \Sigma)$, then
$\zeta_{ln+1} = \zeta_{ln+2} = \cdots = \zeta_{(l+1)n}$ for all $0 \le l \le m - 1$ and $\zeta_1, \zeta_{n+1}, \cdots, \zeta_{(m-1)n+1}$ are i.i.d. $N(0, 1)$-distributed
random variables. Let $\{\zeta_{ij};\ 1 \le i \le n,\ 1 \le j \le m\}$ be i.i.d. $N(0, 1)$-distributed random
variables. Then

$$(\underbrace{\zeta_{i1}, \cdots, \zeta_{i1}}_{n}, \underbrace{\zeta_{i2}, \cdots, \zeta_{i2}}_{n}, \cdots, \underbrace{\zeta_{im}, \cdots, \zeta_{im}}_{n})^T \in \mathbb{R}^p, \quad 1 \le i \le n,$$

are i.i.d. random vectors with distribution $N_p(0, \Sigma)$. Denote the corresponding data matrix
by $(x_{ij})_{n \times p}$. Now, take $\tau = n$ and $m = [e^{n^{1/4}}]$. Notice $|\Gamma_{p,\delta}| = p$ for any $\delta > 0$. Since $p = mn$,
both (i) and (ii) in Theorem 4 are satisfied, but (iii) is not. Obviously,

$$L_{n,\tau} = \max_{1 \le i < j \le p,\ |i-j| \ge \tau} |\rho_{ij}| = \max_{1 \le i < j \le m} |\rho_{ij}|,$$

where, on the right-hand side, $\rho_{ij}$ is obtained from $(\zeta_{ij})_{n \times m}$ as in (1) (note that the $mn$ entries of
$(\zeta_{ij})_{n \times m}$ are i.i.d. with distribution $N(0, 1)$). By Theorem 3 applied to this maximum over
$1 \le i < j \le m$, we have that $nL_{n,\tau}^2 - 4\log m + \log\log m$ converges weakly to $F$, which is the same as the $F$ in Theorem 4. Set
$\log_2 x = \log\log x$ for $x > 1$. Notice

$$nL_{n,\tau}^2 - 4\log m + \log_2 m = nL_{n,\tau}^2 - 4\log p + 4\log n + \log_2 m \sim \left(nL_{n,\tau}^2 - 4\log p + \log_2 p\right) + 4\log n$$

since $p = mn$ and $\log_2 p - \log_2 m \to 0$. Further, it is easy to check that $4\log n - 16\log_2 p \to 0$.
Therefore, the previous conclusion is equivalent to

$$\left(nL_{n,\tau}^2 - 4\log p + \log\log p\right) + 16\log\log p \text{ converges weakly to } F \tag{15}$$

as $n \to \infty$. This is different from the conclusion of Theorem 4.

3 Testing the Covariance Structure 

The limiting laws derived in the last section have immediate statistical applications. Testing
the covariance structure of a high dimensional random variable is an important problem in
statistical inference. In particular, as mentioned earlier, in econometrics when testing certain
economic theories and in time series analysis in general it is of significant interest to test
the hypothesis that the covariance matrix $\Sigma$ is banded. That is, the variables have nonzero
correlations only up to a certain lag $\tau$. The limiting distribution of $L_{n,\tau}$ obtained in Section
2 can be readily used to construct a test for the bandedness of the covariance matrix in the
Gaussian case.

Suppose we observe independent and identically distributed $p$-variate Gaussian variables
$Y_1, \ldots, Y_n$ with mean $\mu_{p \times 1}$, covariance matrix $\Sigma_{p \times p} = (\sigma_{ij})$ and correlation matrix
$R_{p \times p} = (r_{ij})$. For a given integer $\tau \ge 1$ and a given significance level $0 < \alpha < 1$, we wish to test
the hypotheses

$$H_0 : \sigma_{ij} = 0 \text{ for all } |i - j| \ge \tau \quad \text{versus} \quad H_a : \sigma_{ij} \ne 0 \text{ for some } |i - j| \ge \tau. \tag{16}$$

A case of special interest is $\tau = 1$, which corresponds to testing independence of the
Gaussian random variables. The asymptotic distribution of $L_{n,\tau}$ derived in Section 2.2 can
be used to construct a convenient test statistic for testing the hypotheses in (16).
Based on the asymptotic result given in Theorem 4 that

$$P\left(nL_{n,\tau}^2 - 4\log p + \log\log p \le y\right) \to e^{-\frac{1}{\sqrt{8\pi}} e^{-y/2}}, \tag{17}$$

we define a test for testing the hypotheses in (16) by

$$T = I\left(L_{n,\tau}^2 \ge n^{-1}\left(4\log p - \log\log p - \log(8\pi) - 2\log\log(1 - \alpha)^{-1}\right)\right). \tag{18}$$

That is, we reject the null hypothesis $H_0$ whenever

$$L_{n,\tau}^2 \ge n^{-1}\left(4\log p - \log\log p - \log(8\pi) - 2\log\log(1 - \alpha)^{-1}\right).$$

Note that for $\tau = 1$, $L_{n,\tau}$ reduces to $L_n$ and the test is then based on the coherence $L_n$.

Theorem 5 Under the conditions of Theorem 4, the test $T$ defined in (18) has size $\alpha$
asymptotically.

This result is a direct consequence of (17).
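
The rejection rule in (18) amounts to comparing $L_{n,\tau}^2$ with a critical value that depends only on $n$, $p$ and $\alpha$. The sketch below is a minimal illustration; the function names are hypothetical, and the value of $L_{n,\tau}$ is assumed to have been computed separately from the data as in (5).

```python
import math

def critical_value(n, p, alpha):
    """Right-hand side of the rejection rule in (18); requires p > 1, 0 < alpha < 1."""
    return (4.0 * math.log(p)
            - math.log(math.log(p))
            - math.log(8.0 * math.pi)
            - 2.0 * math.log(math.log(1.0 / (1.0 - alpha)))) / n

def reject_banded(L_n_tau, n, p, alpha):
    """True if the null hypothesis of bandedness is rejected at level alpha."""
    return L_n_tau ** 2 >= critical_value(n, p, alpha)
```

For instance, `reject_banded(L, n=100, p=1000, alpha=0.05)` applies the test with asymptotic size 0.05; since the size is only asymptotic, the actual level at a given $n$ and $p$ may differ.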

Remark 3.1 For testing independence, another natural approach is to build a test based
on the largest eigenvalue $\lambda_{\max}$ of the sample correlation matrix. However, the limiting
distribution of the largest eigenvalue $\lambda_{\max}$ is unknown even for the case $p/n \to c$, a finite
and positive constant. For $\tau \ge 2$, the eigenvalues are not useful for testing bandedness of
the covariance matrix.

4 Construction of Compressed Sensing Matrices



As mentioned in the introduction, an important problem in compressed sensing is the
construction of measurement matrices $X_{n \times p}$ which enable the precise recovery of a sparse
signal $\beta$ from linear measurements $y = X\beta$ using an efficient recovery algorithm. Such a
measurement matrix $X$ is difficult to construct deterministically. It has been shown that
a randomly generated matrix $X$ can satisfy the so-called RIP condition with high probability.

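The best known example is perhaps the $n \times p$ random matrix $X$ whose entries $x_{ij}$ are iid
normal variables

$$x_{ij} \overset{iid}{\sim} N(0, n^{-1}). \tag{19}$$

Other examples include generating $X = (x_{ij})$ by Bernoulli random variables

$$x_{ij} \overset{iid}{\sim} \begin{cases} 1/\sqrt{n} & \text{with probability } 1/2; \\ -1/\sqrt{n} & \text{with probability } 1/2, \end{cases} \tag{20}$$

or more sparsely by

$$x_{ij} \overset{iid}{\sim} \begin{cases} \sqrt{3/n} & \text{with probability } 1/6; \\ 0 & \text{with probability } 2/3; \\ -\sqrt{3/n} & \text{with probability } 1/6. \end{cases} \tag{21}$$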

These random matrices are shown to satisfy the RIP conditions with high probability. See 
Achlioptas (2001) and Baraniuk, et al. (2008). 
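
The three ensembles above (Gaussian, symmetric Bernoulli and sparse three-point entries, each with mean 0 and variance $1/n$) are straightforward to simulate. The following is a minimal sketch using only the Python standard library; the function names and the use of a `random.Random` instance for reproducibility are choices made for this illustration.

```python
import math
import random

def gaussian_matrix(n, p, rng):
    """n x p matrix with i.i.d. N(0, 1/n) entries."""
    sd = 1.0 / math.sqrt(n)
    return [[rng.gauss(0.0, sd) for _ in range(p)] for _ in range(n)]

def bernoulli_matrix(n, p, rng):
    """n x p matrix with i.i.d. entries equal to +1/sqrt(n) or -1/sqrt(n), each w.p. 1/2."""
    a = 1.0 / math.sqrt(n)
    return [[rng.choice((a, -a)) for _ in range(p)] for _ in range(n)]

def sparse_matrix(n, p, rng):
    """n x p matrix with i.i.d. entries sqrt(3/n), 0, -sqrt(3/n) w.p. 1/6, 2/3, 1/6."""
    a = math.sqrt(3.0 / n)
    values, weights = (a, 0.0, -a), (1, 4, 1)
    return [[rng.choices(values, weights=weights)[0] for _ in range(p)]
            for _ in range(n)]

rng = random.Random(0)          # fixed seed, for reproducibility only
X = sparse_matrix(50, 200, rng)
```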

In addition to RIP, another commonly used condition is the mutual incoherence property
(MIP) which requires the pairwise correlations among the column vectors of $X$ to be small.
In compressed sensing $\tilde{L}_n$ (instead of $L_n$) is commonly used. It has been shown that the
condition

$$(2k - 1)\tilde{L}_n < 1 \tag{22}$$

ensures the exact recovery of a $k$-sparse signal $\beta$ in the noiseless case where $y = X\beta$, and
stable recovery of a sparse signal in the noisy case where

$$y = X\beta + z.$$

Here $z$ is an error vector, not necessarily random. The MIP (22) is a very desirable property.
When the measurement matrix $X$ satisfies (22), the constrained $\ell_1$ minimizer can be shown
to be exact in the noiseless case and near-optimal in the noisy case. Under the MIP
condition, the analysis of $\ell_1$ minimization methods is also particularly simple. See, e.g.,
Cai, Wang and Xu (2010b).

The results given in Theorems 1 and 2 can be used to show how likely a random matrix
is to satisfy the MIP condition (22). Under the conditions of either Theorem 1 or Theorem 2,

$$\tilde{L}_n \sim 2\sqrt{\frac{\log p}{n}}.$$

So in order for the MIP condition (22) to hold, roughly the sparsity $k$ should satisfy

$$k < \frac{1}{4}\sqrt{\frac{n}{\log p}}.$$
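
To make the heuristic concrete, the sketch below plugs the asymptotic value $2\sqrt{\log p / n}$ into the MIP condition $(2k-1)\tilde{L}_n < 1$ and returns the largest integer $k$ satisfying it. This is only a back-of-the-envelope calculation under the assumption that the coherence equals its asymptotic value exactly; the function names are illustrative.

```python
import math

def asymptotic_coherence(n, p):
    """Asymptotic value 2*sqrt(log(p)/n) of the coherence (requires p > 1)."""
    return 2.0 * math.sqrt(math.log(p) / n)

def max_sparsity_heuristic(n, p):
    """Largest integer k with (2k - 1) * asymptotic_coherence(n, p) < 1 (0 if none)."""
    c = asymptotic_coherence(n, p)
    bound = (1.0 / c + 1.0) / 2.0      # the condition is equivalent to k < bound
    return max(math.ceil(bound) - 1, 0)
```

The exact bound here is $k < \frac{1}{4}\sqrt{n/\log p} + \frac{1}{2}$, which agrees with the rough condition in the text up to the additive constant.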

In fact we have the following more precise result which is proved in Section 6.



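Proposition 4.1 Let $X_n = (x_{ij})_{n \times p}$ where the $x_{ij}$ are i.i.d. random variables with mean
$\mu$, variance $\sigma^2 > 0$ and $E e^{t_0 |x_{11}|^2} < \infty$ for some $t_0 > 0$. Let $\tilde{L}_n$ be as in (3). Then
$P(\tilde{L}_n \ge t) \le 3p^2 e^{-n g(t)}$ where $g(t) = \min\{I_1(t/2), I_2(1/2)\} > 0$ for any $t > 0$ and

$$I_1(x) = \sup_{\theta \in \mathbb{R}}\{\theta x - \log E e^{\theta \xi \eta}\} \quad \text{and} \quad I_2(x) = \sup_{\theta \in \mathbb{R}}\{\theta x - \log E e^{\theta \xi^2}\},$$

and $\xi$, $\eta$, $(x_{11} - \mu)/\sigma$ are i.i.d.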

We now consider the three particular random matrices mentioned in the beginning of 
this section. 

Example 1. Let $x_{11} \sim N(0, n^{-1})$ as in (19). In this case, according to the above
proposition, we have

$$P\left((2k - 1)\tilde{L}_n < 1\right) \ge 1 - 3p^2 \exp\left\{-\frac{n}{12(2k - 1)^2}\right\} \tag{23}$$

for all $n \ge 2$ and $k \ge 1$. The verification of this example together with the next two is
given in the Appendix.

Example 2. Let $x_{11}$ be such that $P(x_{11} = \pm 1/\sqrt{n}) = 1/2$ as in (20). In this case, we have

$$P\left((2k - 1)\tilde{L}_n < 1\right) \ge 1 - 3p^2 \exp\left\{-\frac{n}{12(2k - 1)^2}\right\} \tag{24}$$

for all $n \ge 2$ and $k \ge 1$.

Example 3. Let $x_{11}$ be such that $P(x_{11} = \pm\sqrt{3/n}) = 1/6$ and $P(x_{11} = 0) = 2/3$ as in
(21). Then

$$P\left((2k - 1)\tilde{L}_n < 1\right) \ge 1 - 3p^2 \exp\left\{-\frac{n}{12(2k - 1)^2}\right\} \tag{25}$$

for all $n \ge 2$ and $k \ge 2$.



Remark 4.1 One can see from the above that (23) is true for all of the three examples
with different restrictions on $k$. In fact this is always the case as long as $E e^{t_0 |x_{11}|^2} < \infty$ for
some $t_0 > 0$, which can be seen from Lemma 6.8.



Remark 4.2 Here we would like to point out an error on p. 801 of Donoho (2006b)
and p. 2147 of Candes and Plan (2009): the coherence of a random matrix with i.i.d.
Gaussian entries is about $2\sqrt{\frac{\log p}{n}}$, not $\sqrt{\frac{2\log p}{n}}$.

5 Discussion and Comparison with Related Results



This paper studies the limiting laws of the largest magnitude of the off-diagonal entries of the
sample correlation matrix in the high-dimensional setting. Entries of other types of random
matrices have been studied in the literature, see, e.g., Diaconis, Eaton and Lauritzen (1992),
and Jiang (2004a, 2005, 2006, 2009). Asymptotic properties of the eigenvalues of the sample
correlation matrix have also been studied when both $p$ and $n$ are large and proportional to
each other. For instance, it is proved in Jiang (2004b) that the empirical distributions of
the eigenvalues of the sample correlation matrices converge to the Marchenko-Pastur law;
the largest and smallest eigenvalues satisfy certain laws of large numbers. However, the
high-dimensional case of $p \gg n$ remains an open problem.

The motivations of our current work consist of the applications to testing covariance 
structure and construction of compressed sensing matrices in the ultra-high dimensional 
setting where the dimension $p$ can be as large as $e^{n^\beta}$ for some $0 < \beta < 1$. The setting is
different from those considered in the earlier literature such as Jiang (2004), Zhou (2007), 
Liu, Lin and Shao (2008), and Li, Liu and Rosalsky (2009). Our main theorems and 
techniques are different from those mentioned above in the following two aspects: 

(a) Given $n \to \infty$, we push the size of $p$ as large as we can to make the law of large numbers and limiting results on $L_n$ and $\tilde{L}_n$ valid. Our current theorems say that, under some moment conditions, these results hold as long as $\log p = o(n^\beta)$ for a certain $\beta > 0$.

(b) We study $L_n$ and $\tilde{L}_n$ when the $p$ coordinates of the underlying multivariate distribution are not i.i.d. Instead, the $p$ coordinates follow a multivariate normal distribution $N_p(\mu, \Sigma)$ with $\Sigma$ being banded and $\mu$ arbitrary. Obviously, the $p$ coordinates are dependent. The proofs of our theorems are more subtle and involved than those in the earlier papers. In fact, we have to consider the dependence structure of $\Sigma$ in detail, which is more complicated than the independent case. See Lemmas 6.10, 6.11 and 6.12.

Liu, Lin and Shao (2008) introduced a statistic for testing independence that is different from $L_n$ and $\tilde{L}_n$ to improve the convergence speed of the two statistics under the constraint $c_1 n^\alpha \le p \le c_2 n^\alpha$ for some constants $c_1, c_2, \alpha > 0$. In this paper, while pushing the order of $p$ as large as possible to have the limit theorems, we focus on the behavior of $L_n$ and $\tilde{L}_n$ only. This is because $L_n$ and $\tilde{L}_n$ are specifically used in some applications such as compressed sensing. On the other hand, we also consider a more general testing problem where one wishes to test the bandedness of the covariance matrix $\Sigma$ in $N_p(\mu, \Sigma)$ while allowing $\mu$ to be arbitrary. We propose the statistic $L_{n,\tau}$ in (12) and derive its law of large numbers and its limiting distribution. To our knowledge, this is new in the literature. It is interesting to explore the possibility of improving the convergence speed by modifying $L_{n,\tau}$ as that of $L_n$ in Liu, Lin and Shao (2008). We leave this as future work.

6 Proofs 

In this section we prove Theorems 1-4. The letter $C$ stands for a constant and may vary from place to place throughout this section. Also, we sometimes write $p$ for $p_n$ if there is no confusion. For any square matrix $A = (a_{ij})$, define $|||A||| = \max_{1 \le i \ne j \le n} |a_{ij}|$; that is, the maximum of the absolute values of the off-diagonal entries of $A$.

We begin by collecting a few essential technical lemmas in Section 6.1 without proof. Other technical lemmas used in the proofs of the main results are proved in the Appendix.

6.1 Technical Tools 

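Lemma 6.1 (Lemma 2.2 from Jiang (2004a)) Recall $x_i$ and $\Gamma_n$ in (1). Let $h_i = \|x_i - \bar{x}_i\| / \sqrt{n}$ for each $i$. Then

$$|||n\Gamma_n - X_n^T X_n||| \le (b_{n,1}^2 + 2b_{n,1})\, W_n\, b_{n,3}^{-2} + n\, b_{n,3}^{-2}\, b_{n,4}^2,$$

where

$$b_{n,1} = \max_{1 \le i \le p} |h_i - 1|, \quad W_n = \max_{1 \le i < j \le p} |x_i^T x_j|, \quad b_{n,3} = \min_{1 \le i \le p} h_i, \quad b_{n,4} = \max_{1 \le i \le p} |\bar{x}_i|.$$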
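The inequality of Lemma 6.1 is deterministic, so it can be checked directly on any data matrix. The sketch below computes both sides for a matrix given as a list of rows; the function name `lemma_6_1_sides` and the test sizes are assumptions made for illustration, and $\bar{x}_i$ is read as the mean of the $i$-th column.

```python
import math
import random

def lemma_6_1_sides(X):
    """Return (lhs, rhs) of the Lemma 6.1 inequality for an n x p matrix X (list of rows)."""
    n, p = len(X), len(X[0])
    cols = [[X[k][i] for k in range(n)] for i in range(p)]
    means = [sum(c) / n for c in cols]
    # h_i = ||x_i - mean_i|| / sqrt(n)
    h = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    b1 = max(abs(hi - 1.0) for hi in h)
    b3 = min(h)
    b4 = max(abs(m) for m in means)
    lhs, W = 0.0, 0.0
    for i in range(p):
        for j in range(i + 1, p):
            inner = sum(a * b for a, b in zip(cols[i], cols[j]))      # x_i^T x_j
            rho = (inner - n * means[i] * means[j]) / (n * h[i] * h[j])  # sample correlation
            lhs = max(lhs, abs(n * rho - inner))
            W = max(W, abs(inner))
    rhs = (b1 ** 2 + 2.0 * b1) * W / b3 ** 2 + n * b4 ** 2 / b3 ** 2
    return lhs, rhs

rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(40)]
lhs, rhs = lemma_6_1_sides(X)
print(lhs <= rhs)
```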

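The following Poisson approximation result is essentially a special case of Theorem 1 from Arratia et al. (1989).

Lemma 6.2 Let $I$ be an index set and $\{B_\alpha, \alpha \in I\}$ be a set of subsets of $I$, that is, $B_\alpha \subset I$ for each $\alpha \in I$. Let also $\{\eta_\alpha, \alpha \in I\}$ be random variables. For a given $t \in \mathbb{R}$, set $\lambda = \sum_{\alpha \in I} P(\eta_\alpha > t)$. Then

$$\Big|P\big(\max_{\alpha \in I} \eta_\alpha \le t\big) - e^{-\lambda}\Big| \le (1 \wedge \lambda^{-1})(b_1 + b_2 + b_3)$$

where

$$b_1 = \sum_{\alpha \in I} \sum_{\beta \in B_\alpha} P(\eta_\alpha > t) P(\eta_\beta > t),$$

$$b_2 = \sum_{\alpha \in I} \sum_{\alpha \ne \beta \in B_\alpha} P(\eta_\alpha > t, \eta_\beta > t),$$

$$b_3 = \sum_{\alpha \in I} E\big|P(\eta_\alpha > t \mid \sigma(\eta_\beta, \beta \notin B_\alpha)) - P(\eta_\alpha > t)\big|,$$

and $\sigma(\eta_\beta, \beta \notin B_\alpha)$ is the $\sigma$-algebra generated by $\{\eta_\beta, \beta \notin B_\alpha\}$. In particular, if $\eta_\alpha$ is independent of $\{\eta_\beta, \beta \notin B_\alpha\}$ for each $\alpha$, then $b_3 = 0$.

The following conclusion is Example 1 from Sakhanenko (1991). See also Lemma 6.2 from Liu et al. (2008).

Lemma 6.3 Let $\xi_i$, $1 \le i \le n$, be independent random variables with $E\xi_i = 0$. Put

$$s_n^2 = \sum_{i=1}^n E\xi_i^2, \quad \varrho_n = \sum_{i=1}^n E|\xi_i|^3, \quad S_n = \sum_{i=1}^n \xi_i.$$

Assume $\max_{1 \le i \le n} |\xi_i| \le c_n s_n$ for some $0 < c_n \le 1$. Then

$$P(S_n \ge x s_n) = e^{\gamma(x/s_n)} (1 - \Phi(x)) \big(1 + \theta_{n,x}(1 + x) s_n^{-3} \varrho_n\big)$$

for $0 < x \le 1/(18 c_n)$, where $|\gamma(x)| \le 2x^3 \varrho_n$ and $|\theta_{n,x}| \le 36$.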
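To see what the Poisson approximation of Lemma 6.2 gives in the simplest situation, the sketch below takes $m$ independent variables with a common exceedance probability $q$ and the neighborhoods $B_\alpha = \{\alpha\}$, so that $b_2 = b_3 = 0$ and $b_1 = mq^2$. The function name and the parameter values are illustrative assumptions.

```python
import math

def poisson_approx_gap(m, q):
    """Compare |P(max <= t) - exp(-lambda)| with the Lemma 6.2 bound in the independent case."""
    lam = m * q                       # lambda = sum of exceedance probabilities
    exact = (1.0 - q) ** m            # P(max <= t) under independence
    gap = abs(exact - math.exp(-lam))
    b1 = m * q * q                    # B_alpha = {alpha}; b2 = b3 = 0
    bound = min(1.0, 1.0 / lam) * b1
    return gap, bound

for m, q in [(5, 0.1), (10, 0.3), (1000, 0.002)]:
    gap, bound = poisson_approx_gap(m, q)
    print(m, q, gap <= bound)
```

In the proofs below the neighborhoods are larger, which is why the terms $b_{1,n}$ and $b_{2,n}$ have to be controlled separately.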

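The following are moderate deviation results from Chen (1990); see also Chen (1991), Dembo and Zeitouni (1998) and Ledoux (1992). They are a special type of large deviations.

Lemma 6.4 Suppose $\xi, \xi_1, \xi_2, \cdots$ are i.i.d. r.v.'s with $E\xi_1 = 0$ and $E\xi_1^2 = 1$. Put $S_n = \sum_{i=1}^n \xi_i$.

(i) Let $0 < \alpha \le 1$ and $\{a_n; n \ge 1\}$ satisfy that $a_n \to +\infty$ and $a_n = o(n^{\frac{\alpha}{2(2-\alpha)}})$. If $Ee^{t_0|\xi_1|^\alpha} < \infty$ for some $t_0 > 0$, then

$$\lim_{n \to \infty} \frac{1}{a_n^2} \log P\Big(\frac{S_n}{\sqrt{n}\, a_n} \ge u\Big) = -\frac{u^2}{2} \tag{26}$$

for any $u > 0$.

(ii) Let $0 < \alpha < 1$ and $\{a_n; n \ge 1\}$ satisfy that $a_n \to +\infty$ and $a_n = O(n^{\frac{\alpha}{2(2-\alpha)}})$. If $Ee^{t|\xi_1|^\alpha} < \infty$ for all $t > 0$, then (26) also holds.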
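The limit (26) can be seen exactly in the Gaussian case, where $S_n/\sqrt{n}$ is standard normal for every $n$, so the probability in (26) equals $P(N(0,1) \ge u a_n)$. The following sketch, with a hypothetical helper `scaled_log_tail`, evaluates $a^{-2}\log P(N(0,1) \ge ua)$ for growing $a$; it is an illustration of the statement, not part of the proof.

```python
import math

def scaled_log_tail(u, a):
    """(1/a^2) * log P(N(0,1) >= u*a), computed via the complementary error function."""
    tail = 0.5 * math.erfc(u * a / math.sqrt(2.0))
    return math.log(tail) / (a * a)

for a in (5.0, 10.0, 30.0):
    print(a, scaled_log_tail(1.0, a))   # approaches -u^2/2 = -0.5 as a grows
```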

6.2 Proofs of Theorems 1 and 2

Recall that a sequence of random variables $\{X_n; n \ge 1\}$ is said to be tight if, for any $\epsilon > 0$, there is a constant $K > 0$ such that $\sup_{n \ge 1} P(|X_n| \ge K) < \epsilon$. Obviously, $\{X_n; n \ge 1\}$ are tight if for some $K > 0$, $\lim_{n \to \infty} P(|X_n| \ge K) = 0$. It is easy to check that

if $\{X_n; n \ge 1\}$ are tight, then for any sequence of constants $\{\epsilon_n; n \ge 1\}$ with $\lim_{n \to \infty} \epsilon_n = 0$, we have $\epsilon_n X_n \to 0$ in probability as $n \to \infty$. (27)

Reviewing the notation $b_{n,i}$'s defined in Lemma 6.1, we have the following properties.

Lemma 6.5 Let $\{x_{ij}; i \ge 1, j \ge 1\}$ be i.i.d. random variables with $Ex_{11} = 0$ and $Ex_{11}^2 = 1$. Then, $b_{n,3} \to 1$ in probability as $n \to \infty$, and $\{\sqrt{n/\log p}\, b_{n,1}\}$ and $\{\sqrt{n/\log p}\, b_{n,4}\}$ are tight provided one of the following conditions holds:

(i) $|x_{11}| \le C$ for some constant $C > 0$, $p_n \to \infty$ and $\log p_n = o(n)$ as $n \to \infty$;

(ii) $Ee^{t_0|x_{11}|^\alpha} < \infty$ for some $0 < \alpha \le 2$ and $t_0 > 0$, and $p_n \to \infty$ and $\log p_n = o(n^\beta)$ as $n \to \infty$, where $\beta = \alpha/(4 - \alpha)$.

Lemma 6.6 Let $\{x_{ij}; i \ge 1, j \ge 1\}$ be i.i.d. random variables with $|x_{11}| \le C$ for a finite constant $C > 0$, $Ex_{11} = 0$ and $E(x_{11}^2) = 1$. Assume $p = p(n) \to \infty$ and $\log p = o(n)$ as $n \to \infty$. Then, for any $\epsilon > 0$ and a sequence of positive numbers $\{t_n\}$ with limit $t > 0$,

$$\Psi_n := E\Big\{P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > t_n \sqrt{n \log p}\Big)^2\Big\} = O\Big(\frac{1}{p^{t^2 - \epsilon}}\Big)$$

as $n \to \infty$, where $P^1$ stands for the conditional probability given $\{x_{k1}, 1 \le k \le n\}$.

Lemma 6.7 Suppose $\{x_{ij}; i \ge 1, j \ge 1\}$ are i.i.d. random variables with $Ex_{11} = 0$, $E(x_{11}^2) = 1$ and $Ee^{t_0|x_{11}|^\alpha} < \infty$ for some $t_0 > 0$ and $\alpha > 0$. Assume $p = p(n) \to \infty$ and $\log p = o(n^\beta)$ as $n \to \infty$, where $\beta = \alpha/(4 + \alpha)$. Then, for any $\epsilon > 0$ and a sequence of positive numbers $\{t_n\}$ with limit $t > 0$,

$$\Psi_n := E\Big\{P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > t_n \sqrt{n \log p}\Big)^2\Big\} = O\Big(\frac{1}{p^{t^2 - \epsilon}}\Big)$$

as $n \to \infty$, where $P^1$ stands for the conditional probability given $\{x_{k1}, 1 \le k \le n\}$.

Lemmas 6.5, 6.6 and 6.7 are proved in the Appendix.

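Proposition 6.1 Suppose the conditions in Lemma 6.6 hold with $X_n = (x_{ij})_{n \times p} = (x_1, \cdots, x_p)$. Define $W_n = \max_{1 \le i < j \le p} |x_i^T x_j| = \max_{1 \le i < j \le p} |\sum_{k=1}^n x_{ki} x_{kj}|$. Then

$$\frac{W_n}{\sqrt{n \log p}} \to 2$$

in probability as $n \to \infty$.

Proof. We first prove

$$\lim_{n \to \infty} P\Big(\frac{W_n}{\sqrt{n \log p}} \ge 2 + 2\epsilon\Big) = 0 \tag{28}$$

for any $\epsilon > 0$. First, since $\{x_{ij}; i \ge 1, j \ge 1\}$ are i.i.d., we have

$$P\big(W_n \ge (2 + 2\epsilon)\sqrt{n \log p}\big) \le \binom{p}{2} \cdot P\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| \ge (2 + 2\epsilon)\sqrt{n \log p}\Big) \tag{29}$$

for any $\epsilon > 0$. Notice $E(|x_{11} x_{12}|^2) = E(|x_{11}|^2) \cdot E(|x_{12}|^2) = 1$. By (i) of Lemma 6.4, using conditions $Ee^{t_0|x_{11} x_{12}|} < \infty$ and $\log p = o(n)$ as $n \to \infty$, we obtain

$$P\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| \ge (2 + 2\epsilon)\sqrt{n \log p}\Big) \le \exp\Big(-\frac{(2 + \epsilon)^2}{2} \log p\Big) \le \frac{1}{p^{2 + \epsilon}} \tag{30}$$

as $n$ is sufficiently large. The above two assertions conclude

$$P\big(W_n \ge (2 + 2\epsilon)\sqrt{n \log p}\big) \le \frac{1}{p^{\epsilon}} \to 0 \tag{31}$$

as $n \to \infty$. Thus (28) holds. Now, to finish the proof, we only need to show

$$\lim_{n \to \infty} P\Big(\frac{W_n}{\sqrt{n \log p}} \le 2 - \epsilon\Big) = 0 \tag{32}$$

for any $\epsilon > 0$ small enough.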

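Set $a_n = (2 - \epsilon)\sqrt{n \log p}$ for $0 < \epsilon < 2$ and

$$y_{ij}^{(n)} = \sum_{k=1}^n x_{ki} x_{kj}$$

for $1 \le i, j \le p$. Then $W_n = \max_{1 \le i < j \le p} |y_{ij}^{(n)}|$ for all $n \ge 1$.

Take $I = \{(i, j); 1 \le i < j \le p\}$. For $u = (i, j) \in I$, set $B_u = \{(k, l) \in I$; one of $k$ and $l$ $= i$ or $j$, but $(k, l) \ne u\}$, $\eta_u = |y_{ij}^{(n)}|$, $t = a_n$ and $A_u = A_{ij} = \{|y_{ij}^{(n)}| > a_n\}$. By the i.i.d. assumption on $\{x_{ij}\}$ and Lemma 6.2,

$$P(W_n \le a_n) \le e^{-\lambda_n} + b_{1,n} + b_{2,n} \tag{33}$$

where

$$\lambda_n = \frac{p(p-1)}{2} P(A_{12}), \quad b_{1,n} \le 2p^3 P(A_{12})^2 \quad \text{and} \quad b_{2,n} \le 2p^3 P(A_{12} A_{13}). \tag{34}$$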
(n) 

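Remember that $y_{12}^{(n)}$ is a sum of i.i.d. bounded random variables with mean 0 and variance 1. By (i) of Lemma 6.4, using conditions $Ee^{t|x_{11} x_{12}|} < \infty$ for any $t > 0$ and $\log p = o(n)$ as $n \to \infty$, we know

$$\lim_{n \to \infty} \frac{1}{\log p} \log P(A_{12}) = -\frac{(2 - \epsilon)^2}{2} \tag{35}$$

for any $\epsilon \in (0, 2)$. Noticing $2 - 2\epsilon < (2 - \epsilon)^2/2 < 2 - \epsilon$ for $\epsilon \in (0, 1)$, we have that

$$\frac{1}{p^{2 - \epsilon}} \le P(A_{12}) \le \frac{1}{p^{2 - 2\epsilon}} \tag{36}$$

as $n$ is sufficiently large. This implies

$$e^{-\lambda_n} \le e^{-p^{\epsilon}/3} \quad \text{and} \quad b_{1,n} \le \frac{2}{p^{1 - 4\epsilon}} \tag{37}$$

for $\epsilon \in (0, 1/4)$ as $n$ is large enough. On the other hand, by independence

$$P(A_{12} A_{13}) = P\big(|y_{12}^{(n)}| > a_n, |y_{13}^{(n)}| > a_n\big) = E\Big\{P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > a_n\Big)^2\Big\} \tag{38}$$

where $P^1$ stands for the conditional probability given $\{x_{k1}, 1 \le k \le n\}$. By Lemma 6.6,

$$P(A_{12} A_{13}) \le \frac{1}{p^{4 - 4\epsilon}} \tag{39}$$

for any $\epsilon > 0$ as $n$ is sufficiently large. Therefore, taking $\epsilon \in (0, 1/4)$, we have

$$b_{2,n} \le 2p^3 P(A_{12} A_{13}) \le \frac{2}{p^{1 - 4\epsilon}} \to 0 \tag{40}$$

as $n \to \infty$. This together with (33) and (37) concludes (32). ■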

Proposition 6.2 Suppose the conditions in Lemma 6.7 hold. Let $W_n$ be as in Lemma 6.1. Then

$$\frac{W_n}{\sqrt{n \log p}} \to 2$$

in probability as $n \to \infty$.

The proof of Proposition 6.2 is similar to that of Proposition 6.1. Details are given in the Appendix.

Proof of Theorem 1. First, for constants $\mu_i \in \mathbb{R}$ and $\sigma_i > 0$, $i = 1, 2, \cdots, p$, it is easy to see that the matrices $X_n = (x_{ij})_{n \times p} = (x_1, x_2, \cdots, x_p)$ and $(\sigma_1 x_1 + \mu_1 e, \sigma_2 x_2 + \mu_2 e, \cdots, \sigma_p x_p + \mu_p e)$ generate the same sample correlation matrix $\Gamma_n = (\rho_{ij})$, where $\rho_{ij}$ is as in (1) and $e = (1, \cdots, 1)' \in \mathbb{R}^n$. Thus, w.l.o.g., we prove the theorem next by assuming that $\{x_{ij}; 1 \le i \le n, 1 \le j \le p\}$ are i.i.d. random variables with mean zero and variance 1.

By Proposition 6.1, under condition $\log p = o(n)$,

$$\frac{W_n}{\sqrt{n \log p}} \to 2$$

in probability as $n \to \infty$. Thus, to prove the theorem, it is enough to show

$$\frac{nL_n - W_n}{\sqrt{n \log p}} \to 0 \tag{41}$$

in probability as $n \to \infty$. From Lemma 6.1,

$$|nL_n - W_n| \le |||n\Gamma_n - X_n^T X_n||| \le (b_{n,1}^2 + 2b_{n,1})\, W_n\, b_{n,3}^{-2} + n\, b_{n,3}^{-2}\, b_{n,4}^2. \tag{42}$$

By (i) of Lemma 6.5, $b_{n,3} \to 1$ in probability as $n \to \infty$, and $\{\sqrt{n/\log p}\, b_{n,1}\}$ and $\{\sqrt{n/\log p}\, b_{n,4}\}$ are all tight. Set $b'_{n,1} = \sqrt{n/\log p}\, b_{n,1}$ and $b'_{n,4} = \sqrt{n/\log p}\, b_{n,4}$ for all $n \ge 1$. Then $\{b'_{n,1}\}$ and $\{b'_{n,4}\}$ are both tight. It follows that

$$\frac{|nL_n - W_n|}{\sqrt{n \log p}} \le \sqrt{\frac{\log p}{n}} \Big(\sqrt{\frac{\log p}{n}}\, b'^{\,2}_{n,1} + 2 b'_{n,1}\Big) \cdot \frac{W_n}{\sqrt{n \log p}} \cdot b_{n,3}^{-2} + \sqrt{\frac{\log p}{n}}\, b_{n,3}^{-2}\, b'^{\,2}_{n,4},$$

which concludes (41) by (27). ■

Proof of Theorem 2. In the proof of Theorem 1, replace "Proposition 6.1" with "Proposition 6.2" and "(i) of Lemma 6.5" with "(ii) of Lemma 6.5", and keep all other statements the same; we then get the desired result. ■

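Proof of Proposition 4.1. Recall the definition of $\tilde{L}_n$ in (3). To prove the conclusion, w.l.o.g., we assume $\mu = 0$ and $\sigma = 1$. Evidently, by the i.i.d. assumption,

$$P(\tilde{L}_n \ge t) \le \binom{p}{2} P\Big(\frac{|x_1^T x_2|}{\|x_1\| \cdot \|x_2\|} \ge t\Big) \le \frac{p^2}{2} P\Big(\frac{|x_1^T x_2|}{n} \ge \frac{t}{2}\Big) + \frac{p^2}{2} \cdot 2 P\Big(\frac{\|x_1\|^2}{n} \le \frac{1}{2}\Big) \tag{43}$$

where the event $\{\|x_1\|^2/n > 1/2, \|x_2\|^2/n > 1/2\}$ and its complement are used to get the last inequality. Since $\{x_{ij}; i \ge 1, j \ge 1\}$ are i.i.d., the condition $Ee^{t_0|x_{11}|^2} < \infty$ implies $Ee^{t_0'|x_{11} x_{12}|} < \infty$ for some $t_0' > 0$. By the Chernoff bound (see, e.g., p. 27 from Dembo and Zeitouni (1998)) and noting that $E(x_{11} x_{12}) = 0$ and $Ex_{11}^2 = 1$, we have

$$P\Big(\frac{|x_1^T x_2|}{n} \ge \frac{t}{2}\Big) \le 2e^{-n I_1(t/2)} \quad \text{and} \quad P\Big(\frac{\|x_1\|^2}{n} \le \frac{1}{2}\Big) \le 2e^{-n I_2(1/2)}$$

for any $n \ge 1$ and $t > 0$, where the following facts about rate functions $I_1(x)$ and $I_2(y)$ are used:

(i) $I_1(x) = 0$ if and only if $x = 0$; $I_2(y) = 0$ if and only if $y = 1$;

(ii) $I_1(x)$ is non-decreasing on $A := [0, \infty)$ and non-increasing on $A^c$. This is also true for $I_2(y)$ with $A = [1, \infty)$.

These and (43) conclude

$$P(\tilde{L}_n \ge t) \le p^2 e^{-n I_1(t/2)} + 2p^2 e^{-n I_2(1/2)} \le 3p^2 e^{-n g(t)}$$

where $g(t) = \min\{I_1(t/2), I_2(1/2)\}$ for any $t > 0$. Obviously, $g(t) > 0$ for any $t > 0$ from (i) and (ii) above. ■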

Lemma 6.8 Let $Z$ be a random variable with $EZ = 0$, $EZ^2 = 1$ and $Ee^{t_0|Z|} < \infty$ for some $t_0 > 0$. Choose $\alpha > 0$ such that $E(Z^2 e^{\alpha|Z|}) \le 3/2$. Set $I(x) = \sup_{t \in \mathbb{R}} \{tx - \log Ee^{tZ}\}$. Then $I(x) \ge x^2/3$ for all $0 \le x \le 3\alpha/2$.

Proof. By the Taylor expansion, for any $x \in \mathbb{R}$, $e^x = 1 + x + \frac{x^2}{2} e^{\theta x}$ for some $\theta \in [0, 1]$. It follows from $EZ = 0$ that

$$Ee^{tZ} = 1 + \frac{t^2}{2} E(Z^2 e^{\theta t Z}) \le 1 + \frac{t^2}{2} E(Z^2 e^{t|Z|}) \le 1 + \frac{3}{4} t^2$$

for all $0 \le t \le \alpha$. Use the inequality $\log(1 + x) \le x$ for all $x > -1$ to see that $\log Ee^{tZ} \le 3t^2/4$ for every $0 \le t \le \alpha$. Take $t_0 = 2x/3$ with $x > 0$. Then $0 < t_0 \le \alpha$ for all $0 < x \le 3\alpha/2$. It follows that

$$I(x) \ge t_0 x - \frac{3}{4} t_0^2 = \frac{x^2}{3}. \quad ■$$
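A quick numerical check of Lemma 6.8 can be made for the symmetric Bernoulli case $Z = \pm 1$ with equal probability, where $Ee^{tZ} = \cosh t$ and $E(Z^2 e^{\alpha|Z|}) = e^\alpha$, so $\alpha = \log(3/2)$ is admissible. The sketch below approximates the supremum over a grid of $t$ values; the helper name `rate_function` and the grid parameters are assumptions made for illustration. Since a grid supremum never exceeds the true supremum, the comparison errs on the safe side.

```python
import math

def rate_function(x, t_max=5.0, steps=500):
    """Grid approximation of I(x) = sup_t { t*x - log cosh(t) } for Z = +/-1 equally likely."""
    best = 0.0                                  # t = 0 always gives the value 0
    for k in range(steps + 1):
        t = t_max * k / steps
        best = max(best, t * x - math.log(math.cosh(t)))
    return best

alpha = math.log(1.5)                           # E(Z^2 e^{alpha|Z|}) = e^alpha = 3/2
xs = [1.5 * alpha * k / 12 for k in range(13)]  # grid on [0, 3*alpha/2]
print(all(rate_function(x) >= x * x / 3.0 for x in xs))
```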
6.3 Proof of Theorem 3

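Lemma 6.9 Let $\xi_1, \cdots, \xi_n$ be i.i.d. random variables with $E\xi_1 = 0$, $E\xi_1^2 = 1$ and $Ee^{t_0|\xi_1|^\alpha} < \infty$ for some $t_0 > 0$ and $0 < \alpha \le 1$. Put $S_n = \sum_{i=1}^n \xi_i$ and $\beta = \alpha/(2 + \alpha)$. Then, for any $\{p_n; n \ge 1\}$ with $0 < p_n \to \infty$ and $\log p_n = o(n^\beta)$ and $\{y_n; n \ge 1\}$ with $y_n \to y > 0$,

$$P\Big(\frac{S_n}{\sqrt{n \log p_n}} \ge y_n\Big) \sim \frac{p_n^{-y_n^2/2} (\log p_n)^{-1/2}}{\sqrt{2\pi}\, y}$$

as $n \to \infty$.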
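In the Gaussian case the tail in Lemma 6.9 is available in closed form, which gives a simple sanity check of the asymptotic equivalence. The sketch below takes $y_n = y$ constant and compares $P(N(0,1) \ge y\sqrt{\log p})$ with $p^{-y^2/2}(\log p)^{-1/2}/(\sqrt{2\pi}\,y)$; the function name `tail_ratio` and the chosen values of $\log p$ are illustrative assumptions.

```python
import math

def tail_ratio(y, log_p):
    """Ratio of the exact Gaussian tail P(N >= y*sqrt(log p)) to the Lemma 6.9 approximation."""
    x = y * math.sqrt(log_p)
    exact = 0.5 * math.erfc(x / math.sqrt(2.0))
    approx = math.exp(-0.5 * y * y * log_p) / (math.sqrt(2.0 * math.pi) * y * math.sqrt(log_p))
    return exact / approx

for log_p in (5.0, 20.0, 50.0):
    print(log_p, tail_ratio(2.0, log_p))   # tends to 1 as log p grows
```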

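Proposition 6.3 Let $\{x_{ij}; i \ge 1, j \ge 1\}$ be i.i.d. random variables with $Ex_{11} = 0$, $E(x_{11}^2) = 1$ and $Ee^{t_0|x_{11}|^\alpha} < \infty$ for some $0 < \alpha \le 2$ and $t_0 > 0$. Set $\beta = \alpha/(4 + \alpha)$. Assume $p = p(n) \to \infty$ and $\log p = o(n^\beta)$ as $n \to \infty$. Then

$$P\Big(\frac{W_n^2 - \alpha_n}{n} \le z\Big) \to e^{-K e^{-z/2}}$$

as $n \to \infty$ for any $z \in \mathbb{R}$, where $\alpha_n = 4n \log p - n \log(\log p)$ and $K = (\sqrt{8\pi})^{-1}$.

Proof. It suffices to show that

$$P\Big(\max_{1 \le i < j \le p} |y_{ij}| \le \sqrt{\alpha_n + nz}\Big) \to e^{-K e^{-z/2}}, \tag{44}$$

where $y_{ij} = \sum_{k=1}^n x_{ki} x_{kj}$. We now apply Lemma 6.2 to prove (44). Take $I = \{(i, j); 1 \le i < j \le p\}$. For $u = (i, j) \in I$, set $X_u = |y_{ij}|$ and $B_u = \{(k, l) \in I$; one of $k$ and $l$ $= i$ or $j$, but $(k, l) \ne u\}$. Let $a_n = \sqrt{\alpha_n + nz}$ and $A_{ij} = \{|y_{ij}| > a_n\}$. Since $\{y_{ij}; (i, j) \in I\}$ are identically distributed, by Lemma 6.2,

$$|P(W_n \le a_n) - e^{-\lambda_n}| \le b_{1,n} + b_{2,n} \tag{45}$$

where

$$\lambda_n = \frac{p(p-1)}{2} P(A_{12}), \quad b_{1,n} \le 2p^3 P(A_{12})^2 \quad \text{and} \quad b_{2,n} \le 2p^3 P(A_{12} A_{13}). \tag{46}$$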



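We first calculate $\lambda_n$. Write

$$\lambda_n = \frac{p^2 - p}{2} P\Big(\frac{|y_{12}|}{\sqrt{n}} > \sqrt{\frac{\alpha_n + nz}{n}}\Big) \tag{47}$$

and $y_{12} = \sum_{i=1}^n \xi_i$, where $\{\xi_i; 1 \le i \le n\}$ are i.i.d. random variables with the same distribution as that of $x_{11} x_{12}$. In particular, $E\xi_1 = 0$ and $E\xi_1^2 = 1$. Note $\alpha_1 := \alpha/2 \le 1$. We then have

$$|\xi_1|^{\alpha_1} = |x_{11}|^{\alpha/2} |x_{12}|^{\alpha/2} \le \frac{1}{2}\big(|x_{11}|^\alpha + |x_{12}|^\alpha\big).$$

Hence, by independence,

$$Ee^{t_0|\xi_1|^{\alpha_1}} \le \big(Ee^{t_0|x_{11}|^\alpha/2}\big)^2 < \infty.$$

Let $y_n = \sqrt{(\alpha_n + nz)/(n \log p)}$. Then $y_n \to 2$ as $n \to \infty$. By Lemma 6.9,

$$P\Big(\frac{y_{12}}{\sqrt{n}} > \sqrt{\frac{\alpha_n + nz}{n}}\Big) = P\Big(\frac{y_{12}}{\sqrt{n \log p}} > y_n\Big) \sim \frac{p^{-y_n^2/2} (\log p)^{-1/2}}{2\sqrt{2\pi}} \sim \frac{e^{-z/2}}{\sqrt{8\pi}} \cdot \frac{1}{p^2}$$

as $n \to \infty$. Considering $Ex_{ij} = 0$, it is easy to see that the above also holds if $y_{12}$ is replaced by $-y_{12}$. These and (47) imply that

$$\lambda_n \sim \frac{p^2 - p}{2} \cdot 2 \cdot \frac{e^{-z/2}}{\sqrt{8\pi}} \cdot \frac{1}{p^2} \sim \frac{e^{-z/2}}{\sqrt{8\pi}} \tag{48}$$

as $n \to \infty$.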

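Recall (45) and (46); to complete the proof, we have to verify that $b_{1,n} \to 0$ and $b_{2,n} \to 0$ as $n \to \infty$. By (46), (47) and (48),

$$b_{1,n} \le 2p^3 P(A_{12})^2 = \frac{8p^3 \lambda_n^2}{(p^2 - p)^2} = O\Big(\frac{1}{p}\Big)$$

as $n \to \infty$. Also, by (46),

$$b_{2,n} \le 2p^3 P\big(|y_{12}| > \sqrt{\alpha_n + nz},\ |y_{13}| > \sqrt{\alpha_n + nz}\big) = 2p^3 E\Big\{P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > t_n \sqrt{n \log p}\Big)^2\Big\},$$

where $P^1$ stands for the conditional probability given $\{x_{k1}; 1 \le k \le n\}$, and $t_n := \sqrt{\alpha_n + nz}/\sqrt{n \log p} \to 2$. By Lemma 6.7, the above expectation is equal to $O(p^{\epsilon - 4})$ as $n \to \infty$ for any $\epsilon > 0$. Now choose $\epsilon \in (0, 1)$; then $b_{2,n} = O(p^{\epsilon - 1}) \to 0$ as $n \to \infty$. The proof is then completed. ■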
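The limit of $\lambda_n$ in (48) can be illustrated numerically by replacing the normalized sum $y_{12}/\sqrt{n}$ with an exact standard normal variable, which is a simplifying assumption and not what the proof does (the proof relies on Lemma 6.9). The sketch below works on the log scale to avoid overflow, uses $p^2/2$ in place of $(p^2 - p)/2$, and compares the result with $e^{-z/2}/\sqrt{8\pi}$; the function names are hypothetical.

```python
import math

def lambda_n_gaussian(log_p, z):
    """(p^2/2) * P(|N(0,1)| > sqrt(4 log p - log log p + z)), a Gaussian proxy for lambda_n."""
    x = math.sqrt(4.0 * log_p - math.log(log_p) + z)
    two_sided_tail = math.erfc(x / math.sqrt(2.0))   # P(|N(0,1)| > x)
    log_pairs = 2.0 * log_p - math.log(2.0)          # log(p^2/2); the lower-order -p/2 is dropped
    return math.exp(log_pairs + math.log(two_sided_tail))

def limit_value(z):
    """The limit e^{-z/2} / sqrt(8*pi) appearing in (48)."""
    return math.exp(-z / 2.0) / math.sqrt(8.0 * math.pi)

for z in (-1.0, 0.0, 1.0):
    print(z, lambda_n_gaussian(50.0, z), limit_value(z))
```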

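Proof of Theorem 3. By the first paragraph in the proof of Theorem 1, w.l.o.g., assume $\mu = 0$ and $\sigma = 1$. From Proposition 6.3 and the Slutsky lemma, it suffices to show

$$\frac{n^2 L_n^2 - W_n^2}{n} \to 0 \tag{49}$$

in probability as $n \to \infty$. Let $\Delta_n = |nL_n - W_n|$ for $n \ge 1$. Observe that

$$|n^2 L_n^2 - W_n^2| = |nL_n - W_n| \cdot |nL_n + W_n| \le \Delta_n \cdot (\Delta_n + 2W_n). \tag{50}$$

It is easy to see from Proposition 6.3 that

$$\frac{W_n}{\sqrt{n \log p}} \to 2 \tag{51}$$

in probability as $n \to \infty$. By Lemma 6.1,

$$\Delta_n \le |||n\Gamma_n - X_n^T X_n||| \le (b_{n,1}^2 + 2b_{n,1})\, W_n\, b_{n,3}^{-2} + n\, b_{n,3}^{-2}\, b_{n,4}^2.$$

By (ii) of Lemma 6.5, $b_{n,3} \to 1$ in probability as $n \to \infty$, and $\{\sqrt{n/\log p}\, b_{n,1}\}$ and $\{\sqrt{n/\log p}\, b_{n,4}\}$ are tight. Set $b'_{n,1} = \sqrt{n/\log p}\, b_{n,1}$ and $b'_{n,4} = \sqrt{n/\log p}\, b_{n,4}$ for all $n \ge 1$. Then $\{b'_{n,1}\}$ and $\{b'_{n,4}\}$ are tight. It follows that

$$\frac{\Delta_n}{\log p} \le \Big(\sqrt{\frac{\log p}{n}}\, b'^{\,2}_{n,1} + 2b'_{n,1}\Big) \cdot \frac{W_n}{\sqrt{n \log p}} \cdot b_{n,3}^{-2} + b_{n,3}^{-2}\, b'^{\,2}_{n,4}, \tag{52}$$

which combining with (51) yields that $\{\Delta_n / \log p\}$ are tight. This and (51) imply that $\{\Delta'_n\}$ and $\{W'_n\}$ are tight, where $\Delta'_n := \Delta_n / \log p$ and $W'_n := W_n / \sqrt{n \log p}$. From (50) and then (27),

$$\frac{|n^2 L_n^2 - W_n^2|}{n} \le \frac{(\log p) \Delta'_n \big\{(\log p) \Delta'_n + 2\sqrt{n \log p}\, W'_n\big\}}{n} = \frac{(\log p)^2}{n} \Delta'^{\,2}_n + 2\sqrt{\frac{(\log p)^3}{n}}\, \Delta'_n W'_n \to 0 \tag{53}$$

in probability as $n \to \infty$ since $\log p = o(n^{1/3})$. This gives (49). ■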
6.4 Proof of Theorem 4

We begin the proof of Theorem 4 by stating three technical lemmas, which are proved in the Appendix.

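Lemma 6.10 Let $\{(u_{k1}, u_{k2}, u_{k3}, u_{k4})^T; 1 \le k \le n\}$ be a sequence of i.i.d. random vectors with distribution $N_4(0, \Sigma_4)$ where

$$\Sigma_4 = \begin{pmatrix} 1 & 0 & r & 0 \\ 0 & 1 & 0 & 0 \\ r & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad |r| \le 1.$$

Set $a_n = (4n \log p - n \log(\log p) + ny)^{1/2}$ for $n \ge e^e$ and $y \in \mathbb{R}$. Suppose $n \to \infty$, $p \to \infty$ with $\log p = o(n^{1/3})$. Then,

$$\sup_{|r| \le 1} P\Big(\Big|\sum_{k=1}^n u_{k1} u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3} u_{k4}\Big| > a_n\Big) = O\Big(\frac{1}{p^{4 - \epsilon}}\Big) \tag{54}$$

for any $\epsilon > 0$.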



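Lemma 6.11 Let $\{(u_{k1}, u_{k2}, u_{k3}, u_{k4})^T; 1 \le k \le n\}$ be a sequence of i.i.d. random vectors with distribution $N_4(0, \Sigma_4)$ where

$$\Sigma_4 = \begin{pmatrix} 1 & 0 & r_1 & 0 \\ 0 & 1 & r_2 & 0 \\ r_1 & r_2 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad |r_1| \le 1,\ |r_2| \le 1.$$

Set $a_n = (4n \log p - n \log(\log p) + ny)^{1/2}$ for $n \ge e^e$ and $y \in \mathbb{R}$. Suppose $n \to \infty$, $p \to \infty$ with $\log p = o(n^{1/3})$. Then, as $n \to \infty$,

$$\sup_{|r_1|, |r_2| \le 1} P\Big(\Big|\sum_{k=1}^n u_{k1} u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3} u_{k4}\Big| > a_n\Big) = O\big(p^{-\frac{8}{3} + \epsilon}\big)$$

for any $\epsilon > 0$.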



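Lemma 6.12 Let $\{(u_{k1}, u_{k2}, u_{k3}, u_{k4})^T; 1 \le k \le n\}$ be a sequence of i.i.d. random vectors with distribution $N_4(0, \Sigma_4)$ where

$$\Sigma_4 = \begin{pmatrix} 1 & 0 & r_1 & 0 \\ 0 & 1 & 0 & r_2 \\ r_1 & 0 & 1 & 0 \\ 0 & r_2 & 0 & 1 \end{pmatrix}, \quad |r_1| \le 1,\ |r_2| \le 1.$$

Set $a_n = (4n \log p - n \log(\log p) + ny)^{1/2}$ for $n \ge e^e$ and $y \in \mathbb{R}$. Suppose $n \to \infty$, $p \to \infty$ with $\log p = o(n^{1/3})$. Then, for any $\delta \in (0, 1)$, there exists $\epsilon_0 = \epsilon(\delta) > 0$ such that

$$\sup_{|r_1|, |r_2| \le 1 - \delta} P\Big(\Big|\sum_{k=1}^n u_{k1} u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3} u_{k4}\Big| > a_n\Big) = O\big(p^{-2 - \epsilon_0}\big) \tag{55}$$

as $n \to \infty$.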

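Recall the notation $\tau$, $\Sigma = (\sigma_{ij})_{p \times p}$ and $X_n = (x_{ij})_{n \times p} \sim N_p(\mu, \Sigma)$ above (11).

Proposition 6.4 Assume $\mu = 0$ and $\sigma_{ii} = 1$ for all $1 \le i \le p$. Define

$$V_n = V_{n,\tau} = \max_{1 \le i < j \le p,\ |j - i| \ge \tau} |x_i^T x_j|. \tag{56}$$

Suppose $n \to \infty$, $p = p_n \to \infty$ with $\log p = o(n^{1/3})$, $\tau = o(p^t)$ for any $t > 0$, and for some $\delta \in (0, 1)$, $|\Gamma_{p,\delta}| = o(p)$ as $n \to \infty$. Then, under $H_0$ in (11),

$$P\Big(\frac{V_n^2 - \alpha_n}{n} \le y\Big) \to e^{-K e^{-y/2}}$$

as $n \to \infty$ for any $y \in \mathbb{R}$, where $\alpha_n = 4n \log p - n \log(\log p)$ and $K = (\sqrt{8\pi})^{-1}$.

Proof. Set $a_n = (4n \log p - n \log(\log p) + ny)^{1/2}$,

$$\Lambda_p = \Big\{(i, j);\ 1 \le i < j \le p,\ j - i \ge \tau,\ \max_{1 \le k \ne i \le p} \{|r_{ik}|\} \le 1 - \delta,\ \max_{1 \le k \ne j \le p} \{|r_{jk}|\} \le 1 - \delta\Big\},$$

$$V'_n = \max_{(i,j) \in \Lambda_p} \Big|\sum_{k=1}^n x_{ki} x_{kj}\Big|. \tag{57}$$

Step 1. We claim that, to prove the proposition, it suffices to show

$$\lim_{n \to \infty} P(V'_n \le a_n) = e^{-K e^{-y/2}} \tag{58}$$

for any $y \in \mathbb{R}$.

In fact, to prove the proposition, we need to show that

$$\lim_{n \to \infty} P(V_n > a_n) = 1 - e^{-K e^{-y/2}} \tag{59}$$

for every $y \in \mathbb{R}$. Notice $\{x_{ki}, x_{kj}; 1 \le k \le n\}$ are $2n$ i.i.d. standard normals if $|j - i| \ge \tau$. Then

$$P(V_n > a_n) \le P(V'_n > a_n) + \sum P\Big(\Big|\sum_{k=1}^n x_{k1} x_{k\,\tau+1}\Big| > a_n\Big)$$

where the sum runs over all pairs $(i, j)$ such that $1 \le i < j \le p$ and one of $i$ and $j$ is in $\Gamma_{p,\delta}$. Note that $|x_{11} x_{1\,\tau+1}| \le (x_{11}^2 + x_{1\,\tau+1}^2)/2$; it follows that $Ee^{|x_{11} x_{1\,\tau+1}|/2} < \infty$ by independence and $E\exp(N(0,1)^2/4) < \infty$. Since $\{x_{k1}, x_{k\,\tau+1}; 1 \le k \le n\}$ are i.i.d. with mean zero and variance one, and $y_n := a_n / \sqrt{n \log p} \to 2$ as $n \to \infty$, taking $\alpha = 1$ in Lemma 6.9, we get

$$P\Big(\frac{1}{\sqrt{n \log p}} \Big|\sum_{k=1}^n x_{k1} x_{k\,\tau+1}\Big| > \frac{a_n}{\sqrt{n \log p}}\Big) \sim 2 \cdot \frac{p^{-y_n^2/2} (\log p)^{-1/2}}{2\sqrt{2\pi}} \sim \frac{e^{-y/2}}{\sqrt{2\pi}} \cdot \frac{1}{p^2} \tag{60}$$

as $n \to \infty$. Moreover, note that the total number of such pairs is no more than $2p\,|\Gamma_{p,\delta}|$. Therefore,

$$P(V'_n > a_n) \le P(V_n > a_n) \le P(V'_n > a_n) + 2p\,|\Gamma_{p,\delta}| \cdot P\Big(\Big|\sum_{k=1}^n x_{k1} x_{k\,\tau+1}\Big| > a_n\Big) \le P(V'_n > a_n) + o(p^2) \cdot O\Big(\frac{1}{p^2}\Big) \tag{61}$$

by the assumption on $\Gamma_{p,\delta}$ and (60). Thus, this together with (58) gives (59).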
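Step 2. We now apply Lemma 6.2 to prove (58). Take $I = \Lambda_p$. For $(i, j) \in I$, set $Z_{ij} = |\sum_{k=1}^n x_{ki} x_{kj}|$,

$B_{ij} = \{(k, l) \in \Lambda_p;\ |s - t| < \tau$ for some $s \in \{k, l\}$ and some $t \in \{i, j\}$, but $(k, l) \ne (i, j)\}$,

$a_n = \sqrt{\alpha_n + ny}$ and $A_{ij} = \{Z_{ij} > a_n\}$.

It is easy to see that $|B_{ij}| \le 2 \cdot (2\tau + 2\tau)p = 8\tau p$ and that $Z_{ij}$ are independent of $\{Z_{kl}; (k, l) \in \Lambda_p \setminus B_{ij}\}$ for any $(i, j) \in \Lambda_p$. By Lemma 6.2,

$$|P(V'_n \le a_n) - e^{-\lambda_n}| \le b_{1,n} + b_{2,n} \tag{62}$$

where

$$\lambda_n = |\Lambda_p| \cdot P(A_{1\,\tau+1}), \quad b_{1,n} \le \sum_{d \in \Lambda_p} \sum_{d' \in B_d} P(A_{1\,\tau+1})^2 \le 8\tau p^3 P(A_{1\,\tau+1})^2 \quad \text{and} \tag{63}$$

$$b_{2,n} \le \sum_{d \in \Lambda_p} \sum_{d' \in B_d} P(Z_d > a_n, Z_{d'} > a_n) \tag{64}$$

from the fact that $\{Z_{ij}; (i, j) \in \Lambda_p\}$ are identically distributed. We first calculate $\lambda_n$. By definition

$$\frac{p^2}{2} > |\Lambda_p| \ge \big|\{(i, j);\ 1 \le i < j \le p,\ j - i \ge \tau\}\big| - 2p \cdot |\Gamma_{p,\delta}| = \sum_{i=1}^{p - \tau} (p - \tau - i + 1) - 2p \cdot |\Gamma_{p,\delta}|.$$

Now the sum above is equal to $\sum_{j=1}^{p - \tau} j = (p - \tau)(p - \tau + 1)/2 \sim p^2/2$ since $\tau = o(p)$. By the assumption $|\Gamma_{p,\delta}| = o(p)$ we conclude that

$$|\Lambda_p| \sim \frac{p^2}{2} \tag{65}$$

as $n \to \infty$. It then follows from (60) that

$$\lambda_n \sim \frac{p^2}{2} \cdot \frac{e^{-y/2}}{\sqrt{2\pi}} \cdot \frac{1}{p^2} \sim \frac{e^{-y/2}}{\sqrt{8\pi}} \tag{66}$$

as $n \to \infty$.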
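The counting step behind (65) is elementary and easy to confirm by brute force. The sketch below counts the pairs $(i, j)$ with $1 \le i < j \le p$ and $j - i \ge \tau$ and compares the count with the closed form $(p - \tau)(p - \tau + 1)/2$; the function names and test sizes are illustrative assumptions, and the correction for $\Gamma_{p,\delta}$ is not modeled.

```python
def count_far_pairs(p, tau):
    """Brute-force count of pairs (i, j) with 1 <= i < j <= p and j - i >= tau."""
    return sum(1 for i in range(1, p + 1) for j in range(i + 1, p + 1) if j - i >= tau)

def closed_form(p, tau):
    """Closed form (p - tau)(p - tau + 1)/2 of the same count."""
    return (p - tau) * (p - tau + 1) // 2

print(count_far_pairs(50, 5), closed_form(50, 5))
print(closed_form(2000, 10) / (2000 ** 2 / 2))   # close to 1 when tau is small relative to p
```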

Recall (62) and (63); to complete the proof, we have to verify that $b_{1,n} \to 0$ and $b_{2,n} \to 0$ as $n \to \infty$. Clearly, by the first expression in (63), we get from (66) and then (65) that

$$b_{1,n} \le 8\tau p^3 P(A_{1\,\tau+1})^2 = \frac{8\tau p^3 \lambda_n^2}{|\Lambda_p|^2} = O\Big(\frac{\tau}{p}\Big) \to 0$$

as $n \to \infty$ by the assumption on $\tau$.

Step 3. Now we consider $b_{2,n}$. Write $d = (d_1, d_2) \in \Lambda_p$ and $d' = (d_3, d_4) \in \Lambda_p$ with $d_1 < d_2$ and $d_3 < d_4$. It is easy to see from (64) that

$$b_{2,n} \le 2 \sum P(Z_d > a_n, Z_{d'} > a_n)$$

where the sum runs over every pair $(d, d')$ satisfying

$d, d' \in \Lambda_p$, $d \ne d'$, $d_1 \le d_3$ and $|d_i - d_j| < \tau$ for some $i \in \{1, 2\}$ and some $j \in \{3, 4\}$. (67)

Geometrically, there are three cases for the locations of $d = (d_1, d_2)$ and $d' = (d_3, d_4)$:

(1) $d_2 \le d_3$; (2) $d_1 \le d_3 < d_4 \le d_2$; (3) $d_1 \le d_3 \le d_2 \le d_4$. (68)

Let $\Omega_j$ be the subset of indices $(d, d')$ with restrictions (67) and ($j$) for $j = 1, 2, 3$. Then

$$b_{2,n} \le 2 \sum_{i=1}^{3} \sum_{(d,d') \in \Omega_i} P(Z_d > a_n, Z_{d'} > a_n). \tag{69}$$

We next analyze each of the three sums separately. Recall all diagonal entries of $\Sigma$ in $N_p(0, \Sigma)$ are equal to 1. Let the random vector

$$(w_1, w_2, \cdots, w_p) \sim N_p(0, \Sigma). \tag{70}$$

Then every $w_i$ has the distribution $N(0, 1)$.

Case (1). Evidently, (67) and (1) of (68) imply that $0 \le d_3 - d_2 < \tau$. Hence, $|\Omega_1| \le \tau p^3$. Further, for $(d, d') \in \Omega_1$, the covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is equal to

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & \gamma & 0 \\ 0 & \gamma & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

for some $\gamma \in [-1, 1]$. Thus, the covariance matrix of $(w_{d_2}, w_{d_1}, w_{d_3}, w_{d_4})$ is equal to

$$\begin{pmatrix} 1 & 0 & \gamma & 0 \\ 0 & 1 & 0 & 0 \\ \gamma & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Recall $Z_d = Z_{d_1,d_2} = Z_{d_2,d_1} = |\sum_{k=1}^n x_{kd_1} x_{kd_2}|$ defined at the beginning of Step 2. By Lemma 6.10, for some $\epsilon > 0$ small enough,

$$\sum_{(d,d') \in \Omega_1} P(Z_d > a_n, Z_{d'} > a_n) = \sum_{(d,d') \in \Omega_1} P(Z_{d_2,d_1} > a_n, Z_{d_3,d_4} > a_n) \le \tau p^3 \cdot O\Big(\frac{1}{p^{4 - \epsilon}}\Big) = O\Big(\frac{\tau}{p^{1 - \epsilon}}\Big) \to 0 \tag{71}$$

as $n \to \infty$ since $\tau = o(p^t)$ for any $t > 0$.

Case (2). For any $(d, d') \in \Omega_2$, there are three possibilities.

(I): $|d_1 - d_3| < \tau$ and $|d_2 - d_4| < \tau$; (II): $|d_1 - d_3| < \tau$ and $|d_2 - d_4| \ge \tau$; (III): $|d_1 - d_3| \ge \tau$ and $|d_2 - d_4| < \tau$. The case that $|d_1 - d_3| \ge \tau$ and $|d_2 - d_4| \ge \tau$ is excluded by (67).

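Let $\Omega_{2,I}$ be the subset of $(d, d') \in \Omega_2$ satisfying (I), and let $\Omega_{2,II}$ and $\Omega_{2,III}$ be defined similarly. It is easy to check that $|\Omega_{2,I}| \le \tau^2 p^2$. The covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is equal to

$$\begin{pmatrix} 1 & 0 & \gamma_1 & 0 \\ 0 & 1 & 0 & \gamma_2 \\ \gamma_1 & 0 & 1 & 0 \\ 0 & \gamma_2 & 0 & 1 \end{pmatrix}$$

for some $\gamma_1, \gamma_2 \in [-1, 1]$. By Lemma 6.12,

$$\sum_{(d,d') \in \Omega_{2,I}} P(Z_d > a_n, Z_{d'} > a_n) = O\Big(\frac{\tau^2}{p^{\epsilon_0}}\Big) \to 0 \tag{72}$$

as $n \to \infty$.

Observe $|\Omega_{2,II}| \le \tau p^3$. The covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is equal to

$$\begin{pmatrix} 1 & 0 & \gamma & 0 \\ 0 & 1 & 0 & 0 \\ \gamma & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad |\gamma| \le 1.$$

By Lemma 6.10, take $\epsilon > 0$ small enough to get

$$\sum_{(d,d') \in \Omega_{2,II}} P(Z_d > a_n, Z_{d'} > a_n) = O\Big(\frac{\tau}{p^{1 - \epsilon}}\Big) \to 0 \tag{73}$$

as $n \to \infty$.

The third case is similar to the second one. In fact, $|\Omega_{2,III}| \le \tau p^3$. The covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is equal to

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & \gamma \\ 0 & 0 & 1 & 0 \\ 0 & \gamma & 0 & 1 \end{pmatrix}, \quad |\gamma| \le 1.$$

Thus, the covariance matrix of $(w_{d_2}, w_{d_1}, w_{d_4}, w_{d_3})$ is equal to $\Sigma_4$ in Lemma 6.10. Then, by the same argument as that in the equality in (71) we get

$$\sum_{(d,d') \in \Omega_{2,III}} P(Z_d > a_n, Z_{d'} > a_n) = O\Big(\frac{\tau}{p^{1 - \epsilon}}\Big) \to 0 \tag{74}$$

as $n \to \infty$ by taking $\epsilon > 0$ small enough. Combining (72), (73) and (74), we conclude

$$\sum_{(d,d') \in \Omega_2} P(Z_d > a_n, Z_{d'} > a_n) \to 0$$

as $n \to \infty$. This and (71) together with (69) say that, to finish the proof of this proposition, it suffices to verify

$$\sum_{(d,d') \in \Omega_3} P(Z_d > a_n, Z_{d'} > a_n) \to 0 \tag{75}$$

as $n \to \infty$. The next lemma confirms this. The proof is then completed. ■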



Lemma 6.13 Let the notation be as in the proof of Proposition 6.4. Then (75) holds.

Proof of Theorem 4. By the first paragraph in the proof of Theorem 1, w.l.o.g., we prove the theorem by assuming that the $n$ rows of $X_n = (x_{ij})_{1 \le i \le n, 1 \le j \le p}$ are i.i.d. random vectors with distribution $N_p(0, \Sigma)$ where all of the diagonal entries of $\Sigma$ are equal to 1. Consequently, by the assumption on $\Sigma$, for any subset $E = \{i_1, i_2, \cdots, i_m\}$ of $\{1, 2, \cdots, p\}$ with $|i_s - i_t| \ge \tau$ for all $1 \le s < t \le m$, we know that $\{x_{ki}; 1 \le k \le n, i \in E\}$ are $mn$ i.i.d. $N(0, 1)$-distributed random variables.

Reviewing the proof of Lemma 6.5, the argument is only based on the distribution of each column of $\{x_{ij}\}_{n \times p}$; the joint distribution of any two different columns is irrelevant. In the current situation, the entries in each column are i.i.d. standard normals. Thus, take $\alpha = 2$ in the lemma to have

$$b_{n,3} \to 1 \text{ in probability as } n \to \infty, \quad \Big\{\sqrt{\frac{n}{\log p}}\, b_{n,1}\Big\} \text{ and } \Big\{\sqrt{\frac{n}{\log p}}\, b_{n,4}\Big\} \text{ are tight} \tag{76}$$

as $n \to \infty$, $p \to \infty$ with $\log p = o(n)$, where $b_{n,1}$, $b_{n,3}$ and $b_{n,4}$ are as in Lemma 6.1. Let $V_n = V_{n,\tau}$ be as in (56). It is seen from Proposition 6.4 that

$$\frac{V_{n,\tau}}{\sqrt{n \log p}} \to 2 \tag{77}$$

in probability as $n \to \infty$, $p \to \infty$ and $\log p = o(n^{1/3})$. Noticing the differences in the indices of $\max_{1 \le i < j \le p} |\rho_{ij}| = L_n$ and $\max_{1 \le i < j \le p,\ |j - i| \ge \tau} |\rho_{ij}| = L_{n,\tau}$, and checking the proof of Lemma 2.2 from Jiang (2004a), it is easy to see that

$$\Delta_n := \max_{1 \le i < j \le p,\ |j - i| \ge \tau} |n\rho_{ij} - x_i^T x_j| \le (b_{n,1}^2 + 2b_{n,1})\, V_{n,\tau}\, b_{n,3}^{-2} + n\, b_{n,3}^{-2}\, b_{n,4}^2. \tag{78}$$

Now, using (76), (77) and (78), replacing $W_n$ with $V_{n,\tau}$ and $L_n$ with $L_{n,\tau}$ in the proof of Theorem 3, and repeating the whole proof again, we obtain

$$\frac{n^2 L_{n,\tau}^2 - V_{n,\tau}^2}{n} \to 0$$

in probability as $n \to \infty$. This together with Proposition 6.4 and the Slutsky lemma yields the desired limiting result for $L_{n,\tau}$. ■

7 Appendix

In this appendix we prove Proposition 6.2 and verify the three examples given in Section 4. We then prove Lemmas 6.5-6.7 and Lemmas 6.9-6.13, which are used in the proof of the main results.

Proof of Proposition 6.2. We prove the proposition by following the outline of the proof of Proposition 6.1 step by step. It suffices to show

$$\lim_{n \to \infty} P\Big(\frac{W_n}{\sqrt{n \log p}} \ge 2 + 2\epsilon\Big) = 0 \quad \text{and} \tag{79}$$

$$\lim_{n \to \infty} P\Big(\frac{W_n}{\sqrt{n \log p}} \le 2 - \epsilon\Big) = 0 \tag{80}$$

for any $\epsilon > 0$ small enough. Note that $|x_{11} x_{12}|^q = |x_{11}|^q \cdot |x_{12}|^q \le |x_{11}|^{2q} + |x_{12}|^{2q}$ for any $q > 0$. The given moment condition implies that $E\exp\big(t_0 |x_{11}|^{4\beta/(1+\beta)}\big) < \infty$. Hence $E\exp\big(|x_{11}|^{4\beta/(1+\beta)}\big) < \infty$ and $E\exp\big(|x_{11} x_{12}|^{2\beta/(1+\beta)}\big) < \infty$. By (i) of Lemma 6.4, (30) holds for $\{p_n\}$ such that $p_n \to \infty$ and $\log p_n = o(n^\beta)$. By using (29) and (31), we obtain (79).

By using the condition $E\exp\{t_0 |x_{11}|^{4\beta/(1+\beta)}\} < \infty$ again, we know (35) also holds for $\{p_n\}$ such that $p_n \to \infty$ and $\log p_n = o(n^\beta)$. Then all statements after (32) and before (38) hold. Now, by Lemma 6.7, (39) holds for $\{p_n\}$ such that $p_n \to \infty$ and $\log p_n = o(n^\beta)$; we then have (40). This implies (32), which is the same as (80). ■



Verifications of (23), (24) and (25). We consider the three one by one.

(i) If $x_{11} \sim N(0, n^{-1})$ as in (19), then $\xi$ and $\eta$ are i.i.d. with distribution $N(0, 1)$. By Lemma 3.2 from Jiang (2005), $I_2(x) = (x - 1 - \log x)/2$ for $x > 0$. So $I_2(1/2) > 1/12$. Also, $Ee^{\theta \xi \eta} = Ee^{\theta^2 \xi^2/2} = (1 - \theta^2)^{-1/2}$ for $|\theta| < 1$. It is straightforward to get

$$I_1(x) = \frac{\sqrt{4x^2 + 1} - 1}{2} - \frac{1}{2} \log \frac{\sqrt{4x^2 + 1} + 1}{2}, \quad x > 0.$$

Let $y = \frac{\sqrt{4x^2 + 1} - 1}{2}$. Then $y > 2x^2/3$ for all $|x| \le 4/5$. Thus, $I_1(x) = y - \frac{1}{2}\log(1 + y) \ge \frac{y}{2} > \frac{x^2}{3}$ for $|x| \le 4/5$. Therefore, $g(t) \ge \min\{I_1(\frac{t}{2}), \frac{1}{12}\} \ge \min\{\frac{t^2}{12}, \frac{1}{12}\} = \frac{t^2}{12}$ for $|t| \le 1$. Note that $1/(2k - 1) \le 1$ if $k \ge 1$. By Proposition 4.1, we have

$$P\big((2k-1)\tilde{L}_n < 1\big) \ge 1 - 3p^2 \exp\Big\{-\frac{n}{12(2k-1)^2}\Big\} \tag{81}$$

for all $n \ge 2$ and $k \ge 1$, which is (23).
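The closed forms used in part (i) can be checked numerically. The sketch below compares the stated formula for $I_1$ with a grid approximation of $\sup_{|\theta| < 1}\{\theta x + \frac{1}{2}\log(1 - \theta^2)\}$, which is the rate function of $\xi\eta$ since $\log Ee^{\theta\xi\eta} = -\frac{1}{2}\log(1 - \theta^2)$, and verifies the two inequalities $I_1(x) > x^2/3$ on $[0, 4/5]$ and $I_2(1/2) > 1/12$. Function names and grid sizes are illustrative assumptions.

```python
import math

def I1_closed(x):
    """Closed form of I_1 for the product of two independent standard normals."""
    s = math.sqrt(4.0 * x * x + 1.0)
    return (s - 1.0) / 2.0 - 0.5 * math.log((s + 1.0) / 2.0)

def I1_numeric(x, steps=20000):
    """Grid approximation of sup over theta in [0, 1) of theta*x + 0.5*log(1 - theta^2)."""
    best = 0.0
    for k in range(steps):
        th = k / steps
        best = max(best, th * x + 0.5 * math.log(1.0 - th * th))
    return best

def I2(x):
    """Rate function of xi^2 for standard normal xi, as quoted from Jiang (2005)."""
    return (x - 1.0 - math.log(x)) / 2.0

print(I1_closed(0.5), I1_numeric(0.5))
print(I2(0.5), 1.0 / 12.0)
```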

(ii) Let $x_{11}$ be such that $P(x_{11} = \pm 1/\sqrt{n}) = 1/2$ as in (20). Then $\xi$ and $\eta$ in Proposition 4.1 are i.i.d. with $P(\xi = \pm 1) = 1/2$. Hence, $P(\xi\eta = \pm 1) = 1/2$ and $\xi^2 = 1$. Immediately, $I_2(1) = 0$ and $I_2(x) = +\infty$ for all $x \ne 1$. If $\alpha = \log\frac{3}{2} \approx 0.405$, then $E(Z^2 e^{\alpha|Z|}) = e^\alpha \le \frac{3}{2}$ with $Z = \xi\eta$. Thus, by Lemma 6.8, $I_1(x) \ge \frac{x^2}{3}$ for all $0 \le x \le \frac{3}{5} \le \frac{3\alpha}{2}$. Therefore, $g(t) \ge \frac{t^2}{12}$ for $0 \le t \le \frac{6}{5}$. This gives that

$$P\big((2k-1)\tilde{L}_n < 1\big) \ge 1 - 3p^2 \exp\Big\{-\frac{n}{12(2k-1)^2}\Big\} \tag{82}$$

provided $\frac{1}{2k-1} \le \frac{6}{5}$, that is, $k \ge \frac{11}{12}$. We then obtain (24) since $k$ is an integer.
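(iii) Let $x_{11}$ be such that $P(x_{11} = \pm\sqrt{3/n}) = 1/6$ and $P(x_{11} = 0) = 2/3$ as in (21). Then $\xi$ and $\eta$ in Proposition 4.1 are i.i.d. with $P(\xi = \pm\sqrt{3}) = 1/6$ and $P(\xi = 0) = 2/3$. It follows that $P(Z = \pm 3) = 1/18$ and $P(Z = 0) = 8/9$ with $Z = \xi\eta$. Take $\alpha = \frac{1}{3}\log\frac{3}{2} > 0.13$. Then $E(Z^2 e^{\alpha|Z|}) = e^{3\alpha} = \frac{3}{2}$. Thus, by Lemma 6.8, $I_1(x) \ge x^2/3$ for all $0 \le x \le \frac{3\alpha}{2} = \frac{1}{2}\log\frac{3}{2} \approx 0.2027$. Now, $P(\xi^2 = 3) = \frac{1}{3} = 1 - P(\xi^2 = 0)$. Hence, $\xi^2/3 \sim \mathrm{Ber}(p)$ with $p = \frac{1}{3}$. It follows that

$$I_2(x) = \sup_{\theta \in \mathbb{R}} \Big\{(3\theta)\frac{x}{3} - \log Ee^{3\theta(\xi^2/3)}\Big\} = \Lambda^*\Big(\frac{x}{3}\Big) = \frac{x}{3}\log x + \Big(1 - \frac{x}{3}\Big)\log\frac{3 - x}{2}$$

for $0 \le x \le 3$ by (b) of Exercise 2.2.23 from [21]. Thus, $I_2(\frac{1}{2}) = \frac{1}{6}\log\frac{1}{2} + \frac{5}{6}\log\frac{5}{4} \approx 0.0704 > \frac{1}{15}$. Now, for $0 \le t \le \frac{2}{5}$, we have

$$g(t) = \min\Big\{I_1\Big(\frac{t}{2}\Big), I_2\Big(\frac{1}{2}\Big)\Big\} \ge \min\Big\{\frac{t^2}{12}, \frac{1}{15}\Big\} = \frac{t^2}{12}.$$

Easily, $t := \frac{1}{2k-1} \le \frac{2}{5}$ if and only if $k \ge \frac{7}{4}$. Thus, by Proposition 4.1,

$$P\big((2k-1)\tilde{L}_n < 1\big) \ge 1 - 3p^2 \exp\Big\{-\frac{n}{12(2k-1)^2}\Big\} \tag{83}$$

for all $n \ge 2$ and $k \ge \frac{7}{4}$. We finally conclude (25) since $k$ is an integer.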
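The arithmetic in part (iii) can be reproduced in a few lines. The sketch below recomputes $E(Z^2 e^{\alpha|Z|})$ for $\alpha = \frac{1}{3}\log\frac{3}{2}$, evaluates the Bernoulli-based rate function at $1/2$, and finds the smallest integer $k$ with $1/(2k-1)$ below a given threshold; the names `I2_sparse` and `smallest_k` are hypothetical helpers introduced only for this check.

```python
import math

alpha = math.log(1.5) / 3.0
# Z = xi*eta equals +3 or -3 with probability 1/18 each, and 0 with probability 8/9
moment = 9.0 * math.exp(3.0 * alpha) * (2.0 / 18.0)   # E(Z^2 e^{alpha|Z|})

def I2_sparse(x):
    """Rate function of xi^2 when xi^2/3 ~ Bernoulli(1/3), valid for 0 < x < 3."""
    return (x / 3.0) * math.log(x) + (1.0 - x / 3.0) * math.log((3.0 - x) / 2.0)

def smallest_k(t_max):
    """Smallest integer k >= 1 with 1/(2k - 1) <= t_max."""
    k = 1
    while 1.0 / (2 * k - 1) > t_max:
        k += 1
    return k

print(moment, 1.5 * alpha, I2_sparse(0.5))
print(smallest_k(2.0 / 5.0), smallest_k(6.0 / 5.0))
```

The two thresholds $2/5$ and $6/5$ correspond to the restrictions on $t = 1/(2k-1)$ in parts (iii) and (ii) respectively.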



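Proof of Lemma 6.5. (i) First, since $x_{ij}$'s are i.i.d. bounded random variables with mean zero and variance one, by (i) of Lemma 6.4,

$$P\Big(\sqrt{\frac{n}{\log p}}\, b_{n,4} \ge K\Big) = P\Big(\max_{1 \le i \le p} \Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n x_{ki}\Big| \ge K\Big) \tag{84}$$

$$\le p \cdot P\Big(\Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n x_{k1}\Big| \ge K\Big) \le p \cdot e^{-(K^2/3)\log p} = \frac{1}{p^{K^2/3 - 1}} \to 0 \tag{85}$$

as $n \to \infty$ for any $K > \sqrt{3}$. This says that $\{\sqrt{n/\log p}\, b_{n,4}\}$ are tight.

Second, noticing that $|t - 1| \le |t^2 - 1|$ for any $t > 0$ and $n h_i^2 = \|x_i - \bar{x}_i\|^2 = x_i^T x_i - n|\bar{x}_i|^2$, we get that

$$b_{n,1} \le \max_{1 \le i \le p} |h_i^2 - 1| \le \max_{1 \le i \le p} \Big|\frac{1}{n} \sum_{k=1}^n (x_{ki}^2 - 1)\Big| + \max_{1 \le i \le p} \Big|\frac{1}{n} \sum_{k=1}^n x_{ki}\Big|^2 = Z_n + b_{n,4}^2 \tag{86}$$

where $Z_n = \max_{1 \le i \le p} |\frac{1}{n} \sum_{k=1}^n (x_{ki}^2 - 1)|$. Therefore,

$$\sqrt{\frac{n}{\log p}}\, b_{n,1} \le \sqrt{\frac{n}{\log p}}\, Z_n + \sqrt{\frac{\log p}{n}} \Big(\sqrt{\frac{n}{\log p}}\, b_{n,4}\Big)^2. \tag{87}$$

Replacing "$x_{ki}$" in (84) with "$x_{ki}^2 - 1$" and using the same argument, we obtain that $\{\sqrt{n/\log p}\, Z_n\}$ are tight. Since $\log p = o(n)$ and $\{\sqrt{n/\log p}\, b_{n,4}\}$ are tight, using (27) we know the second term on the right hand side of (87) goes to zero in probability as $n \to \infty$. Hence, we conclude from (87) that $\{\sqrt{n/\log p}\, b_{n,1}\}$ are tight.

Finally, since $\log p = o(n)$ and $\{\sqrt{n/\log p}\, b_{n,1}\}$ are tight, use (27) to have $b_{n,1} \to 0$ in probability as $n \to \infty$. This implies that $b_{n,3} \to 1$ in probability as $n \to \infty$.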

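(ii) By (85) and (87), to prove the conclusion, it is enough to show, for some constant $K > 0$,

$$p \cdot P\Big(\Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n x_{k1}\Big| \ge K\Big) \to 0 \quad \text{and} \tag{88}$$

$$p \cdot P\Big(\Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n (x_{k1}^2 - 1)\Big| \ge K\Big) \to 0 \tag{89}$$

as $n \to \infty$. Using $a_n := \sqrt{\log p_n} = o(n^{\beta/2})$ and (i) of Lemma 6.4, we have

$$P\Big(\Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n x_{k1}\Big| \ge K\Big) \le \frac{1}{p^{K^2/3}} \quad \text{and} \quad P\Big(\Big|\frac{1}{\sqrt{n \log p}} \sum_{k=1}^n (x_{k1}^2 - 1)\Big| \ge K\Big) \le \frac{1}{p^{K^2/3}}$$

as $n$ is sufficiently large, where the first inequality holds provided $E\exp\big(t_0 |x_{11}|^{2\beta/(1+\beta)}\big) = E\exp(t_0 |x_{11}|^{\alpha/2}) < \infty$; the second holds since $E\exp\big(t_0 |x_{11}^2 - 1|^{2\beta/(1+\beta)}\big) = E\exp(t_0 |x_{11}^2 - 1|^{\alpha/2}) < \infty$ for some $t_0 > 0$, which is equivalent to $Ee^{t_0'|x_{11}|^\alpha} < \infty$ for some $t_0' > 0$. We then get (88) and (89) by taking $K = 2$. ■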



Proof of Lemma 6.6. Let $G_n = \{|\sum_{k=1}^n x_{k1}^2/n - 1| < \delta\}$. Then, by the Chernoff bound (see, e.g., p. 27 from Dembo and Zeitouni (1998)), for any $\delta \in (0, 1)$, there exists a constant $C_\delta > 0$ such that $P(G_n^c) \le 2e^{-nC_\delta}$ for all $n \ge 1$. Set $a_n = t_n\sqrt{n \log p}$. Then

$$\Psi_n \le E\Big\{P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > a_n\Big)^2 I_{G_n}\Big\} + 2e^{-nC_\delta} \tag{90}$$

for all $n \ge 1$. Evidently, $|x_{k1} x_{k2}| \le C^2$, $E^1(x_{k1} x_{k2}) = 0$ and $E^1(x_{k1} x_{k2})^2 = x_{k1}^2$, where $E^1$ stands for the conditional expectation given $\{x_{k1}, 1 \le k \le n\}$. By the Bernstein inequality (see, e.g., p. 111 from Chow and Teicher (1997)),

$$P^1\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > a_n\Big)^2 I_{G_n} \le 4 \cdot \exp\Big\{-\frac{a_n^2}{\sum_{k=1}^n x_{k1}^2 + C^2 a_n}\Big\} I_{G_n} \le 4 \cdot \exp\Big\{-\frac{a_n^2}{(1 + \delta)n + C^2 a_n}\Big\} \le \frac{1}{p^{t^2/(1 + 2\delta)}} \tag{91}$$

as $n$ is sufficiently large, since $a_n^2/(n(1 + \delta) + C^2 a_n) \sim t^2(\log p)/(1 + \delta)$ as $n \to \infty$. Recalling (90), the conclusion then follows by taking $\delta$ small enough. ■



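Proof of Lemma 6.7. Let $P^2$ stand for the conditional probability given $\{x_{k2}, 1 \le k \le n\}$. Since $\{x_{ij}; i \ge 1, j \ge 1\}$ are i.i.d., to prove the lemma, it is enough to prove

$$\Psi_n := E\Big\{P^2\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > t_n\sqrt{n \log p}\Big)^2\Big\} = O\Big(\frac{1}{p^{t^2 - \epsilon}}\Big) \tag{92}$$

as $n \to \infty$. We do this only for convenience of notation.

Step 1. For any $x > 0$, by the Markov inequality

$$P\Big(\max_{1 \le k \le n} |x_{k2}| \ge x\Big) \le nP(|x_{12}| \ge x) \le Cne^{-t_0 x^\alpha} \tag{93}$$

where $C = Ee^{t_0|x_{11}|^\alpha} < \infty$. Second, the given condition implies that $Ee^{t|x_{11}^2 - 1|^{2\beta/(1+\beta)}} < \infty$ for any $t > 0$. For any $\epsilon > 0$, by (ii) of Lemma 6.4, there exists a constant $C = C_\epsilon > 0$ such that

$$P\Big(\frac{|\sum_{k=1}^n x_{k2}^2 - n|}{n^{(\beta+1)/2}} \ge \epsilon\Big) \le e^{-Cn^\beta} \tag{94}$$

for each $n \ge 1$.

Set $h_n = n^{(1-\beta)/4}$, $\mu_n = Ex_{ij} I(|x_{ij}| \le h_n)$,

$$y_{ij} = x_{ij} I(|x_{ij}| \le h_n) - Ex_{ij} I(|x_{ij}| \le h_n), \quad z_{ij} = x_{ij} I(|x_{ij}| > h_n) - Ex_{ij} I(|x_{ij}| > h_n) \tag{95}$$

for all $i \ge 1$ and $j \ge 1$. Then, $x_{ij} = y_{ij} + z_{ij}$ for all $i, j \ge 1$. Use the inequality $P(U + V \ge u + v) \le P(U \ge u) + P(V \ge v)$ to obtain

$$P^2\Big(\Big|\sum_{k=1}^n x_{k1} x_{k2}\Big| > t_n\sqrt{n \log p}\Big)^2 \le 2P^2\Big(\Big|\sum_{k=1}^n y_{k1} x_{k2}\Big| > (t_n - \delta)\sqrt{n \log p}\Big)^2 + 2P^2\Big(\Big|\sum_{k=1}^n z_{k1} x_{k2}\Big| > \delta\sqrt{n \log p}\Big)^2 := 2A_n + 2B_n \tag{96}$$

for any $\delta > 0$ small enough. Hence,

$$\Psi_n \le 2EA_n + 2EB_n \tag{97}$$

for all $n \ge 2$.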

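Step 2: the bound of $A_n$. Now, if $\max_{1\le k\le n}|x_{k2}| \le h_n$, then $|y_{k1}x_{k2}| \le 2h_n^2$ for all $k \ge 1$. It
then follows from the Bernstein inequality (see, e.g., p. 111 from Chow and Teicher (1997))
that

$$
\begin{aligned}
A_n &= P^2\Big(\Big|\sum_{k=1}^n y_{k1}x_{k2}\Big| > (t_n - \delta)\sqrt{n\log p}\Big)^2 \\
&\le 4\exp\Big\{-\frac{(t_n-\delta)^2 n\log p}{E(y_{11}^2)\sum_{k=1}^n x_{k2}^2 + 2h_n^2(t_n-\delta)\sqrt{n\log p}/3}\Big\} \\
&\le 4\exp\Big\{-\frac{(t_n-\delta)^2 n\log p}{E(y_{11}^2)(n + \epsilon n^{(\beta+1)/2}) + 2h_n^2(t_n-\delta)\sqrt{n\log p}/3}\Big\}
\end{aligned}
$$

for $0 < \delta < t_n$ and $|\sum_{k=1}^n x_{k2}^2 - n|/n^{(\beta+1)/2} \le \epsilon$. Notice $E(y_{11}^2) \to 1$ and $2h_n^2(t_n - \delta)\sqrt{n\log p}/3 = o(n)$
as $n \to \infty$. Thus,

$$
\frac{(t_n-\delta)^2 n\log p}{E(y_{11}^2)(n + \epsilon n^{(\beta+1)/2}) + 2h_n^2(t_n-\delta)\sqrt{n\log p}/3} \sim (t - \delta)^2\log p
$$

as $n \to \infty$. In summary, if $\max_{1\le k\le n}|x_{k2}| \le h_n$ and $|\sum_{k=1}^n x_{k2}^2 - n|/n^{(\beta+1)/2} \le \epsilon$, then for any $\delta \in (0, t/2)$

$$
A_n \le \frac{1}{p^{t^2 - 2t\delta}} \tag{98}
$$

as n is sufficiently large. Therefore, for any $\epsilon > 0$ small enough, take $\delta$ sufficiently small to
obtain

$$
\begin{aligned}
EA_n &= E\Big\{P^2\Big(\Big|\sum_{k=1}^n y_{k1}x_{k2}\Big| > (t_n-\delta)\sqrt{n\log p}\Big)^2\Big\} \\
&\le \frac{1}{p^{t^2-\epsilon}} + P\Big(\max_{1\le k\le n}|x_{k2}| > h_n\Big) + P\Big(\frac{|\sum_{k=1}^n x_{k2}^2 - n|}{n^{(\beta+1)/2}} > \epsilon\Big) \\
&\le \frac{1}{p^{t^2-\epsilon}} + Cne^{-t_0h_n^{\alpha}} + e^{-C_\epsilon n^{\beta}} = O\Big(\frac{1}{p^{t^2-\epsilon}}\Big)
\end{aligned} \tag{99}
$$

as $n \to \infty$, where the second inequality follows from (93) and (94), and the last identity
follows from the fact that $h_n^{\alpha} = n^{\beta}$ and the assumption $\log p = o(n^{\beta})$.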

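Step 3: the bound of $B_n$. Recalling the definition of $z_{ij}$ and $\mu_n$ in (95), we have

$$
\begin{aligned}
B_n &= P^2\Big(\Big|\sum_{k=1}^n z_{k1}x_{k2}\Big| > \delta\sqrt{n\log p}\Big)^2 \\
&\le \Big\{P^2\Big(\Big|\sum_{k=1}^n x_{k1}x_{k2}I\{|x_{k1}| > h_n\}\Big| > \delta\sqrt{n\log p}/2\Big) + I\Big(\Big|\sum_{k=1}^n x_{k2}\Big| > \frac{\delta\sqrt{n\log p}}{2(e^{-n} + |\mu_n|)}\Big)\Big\}^2 \\
&:= (C_n + D_n)^2.
\end{aligned} \tag{100}
$$

Now, by (93),

$$
C_n \le P\Big(\max_{1\le k\le n}|x_{k1}| > h_n\Big) \le Cne^{-t_0h_n^{\alpha}} = Cne^{-t_0n^{\beta}}. \tag{101}
$$

Easily, $|\mu_n| \le E|x_{11}|I(|x_{11}| > h_n) \le e^{-t_0h_n^{\alpha}/2}E(|x_{11}|e^{t_0|x_{11}|^{\alpha}/2}) = Ce^{-t_0n^{\beta}/2}$. Also, $P(|\sum_{k=1}^n \eta_k| \ge x) \le \sum_{k=1}^n P(|\eta_k| \ge x/n)$ for any random variables $\{\eta_i\}$ and $x > 0$. We then have

$$
\begin{aligned}
ED_n &= P\Big(\Big|\sum_{k=1}^n x_{k2}\Big| > \frac{\delta\sqrt{n\log p}}{2(e^{-n} + |\mu_n|)}\Big) \\
&\le nP\Big(|x_{11}| > \frac{\delta\sqrt{n\log p}}{2n(e^{-n} + |\mu_n|)}\Big) \\
&\le nP\big(|x_{11}| > e^{t_0n^{\beta}/3}\big) \le e^{-n}
\end{aligned} \tag{102}
$$

as n is sufficiently large, where the last inequality is from condition $Ee^{t_0|x_{11}|^{\alpha}} < \infty$. Consequently,

$$
EB_n \le 2E(C_n^2) + 2E(D_n^2) = 2E(C_n^2) + 2E(D_n) \le e^{-t_0n^{\beta}} \tag{103}
$$

as n is sufficiently large. This joint with (97) and (99) yields (92). ■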

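Proof of Lemma 6.9. Take $\gamma = (1 - \beta)/2 \in [1/3, 1/2)$. Set

$$
\eta_i = \xi_iI(|\xi_i| \le n^{\gamma}),\quad \mu_n = E\eta_1 \quad\text{and}\quad \sigma_n^2 = \mathrm{Var}(\eta_1),\quad 1 \le i \le n. \tag{104}
$$

Since the desired result is a conclusion about $n \to \infty$, without loss of generality, assume
$\sigma_n > 0$ for all $n \ge 1$. We first claim that there exists a constant $C > 0$ such that

$$
\max\{|\mu_n|,\ |\sigma_n - 1|,\ P(|\xi_1| > n^{\gamma})\} \le Ce^{-n^{\beta}/C} \tag{105}
$$

for all $n \ge 1$. In fact, since $E\xi_1 = 0$ and $\alpha\gamma = \beta$,

$$
|\mu_n| = |E\xi_1I(|\xi_1| > n^{\gamma})| \le E|\xi_1|I(|\xi_1| > n^{\gamma}) \le C\cdot e^{-t_0n^{\beta}/2} \tag{106}
$$

for all $n \ge 1$. Note that $|\sigma_n - 1| \le |\sigma_n^2 - 1| = \mu_n^2 + E\xi_1^2I(|\xi_1| > n^{\gamma})$. By the same argument
as in (106), we know both $|\sigma_n - 1|$ and $P(|\xi_1| > n^{\gamma})$ are bounded by $Ce^{-n^{\beta}/C}$ for some
$C > 0$. Then (105) follows.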

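Step 1. We prove that, for some constant $C > 0$,

$$
\Big|P\Big(\frac{S_n}{\sqrt{n\log p_n}} \ge y_n\Big) - P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n\Big)\Big| \le 2e^{-n^{\beta}/C} \tag{107}
$$

for all $n \ge 1$. Observe

$$
\xi_i = \eta_i \ \text{ for } 1 \le i \le n \ \text{ if } \max_{1\le i\le n}|\xi_i| \le n^{\gamma}. \tag{108}
$$

Then, by (105),

$$
\begin{aligned}
P\Big(\frac{S_n}{\sqrt{n\log p_n}} \ge y_n\Big)
&\le P\Big(\frac{S_n}{\sqrt{n\log p_n}} \ge y_n,\ \max_{1\le i\le n}|\xi_i| \le n^{\gamma}\Big) + P\Big(\bigcup_{i=1}^n\{|\xi_i| > n^{\gamma}\}\Big) \\
&\le P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n\Big) + Cne^{-n^{\beta}/C}
\end{aligned} \tag{109}
$$

for all $n \ge 1$. Use the inequality $P(AB) \ge P(A) - P(B^c)$ for any events A and B to have

$$
\begin{aligned}
P\Big(\frac{S_n}{\sqrt{n\log p_n}} \ge y_n\Big)
&\ge P\Big(\frac{S_n}{\sqrt{n\log p_n}} \ge y_n,\ \max_{1\le i\le n}|\xi_i| \le n^{\gamma}\Big) \\
&= P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n,\ \max_{1\le i\le n}|\xi_i| \le n^{\gamma}\Big) \\
&\ge P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n\Big) - Cne^{-n^{\beta}/C}
\end{aligned}
$$

where in the last step the inequality $P(\max_{1\le i\le n}|\xi_i| > n^{\gamma}) \le Cne^{-n^{\beta}/C}$ is used as in (109).
This and (109) conclude (107), after enlarging the constant C if necessary.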
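Step 2. Now we prove

$$
P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n\Big) \sim \frac{e^{-x_n^2/2}}{\sqrt{2\pi}\,x_n} \tag{110}
$$

as $n \to \infty$, where

$$
x_n = y_n'\sqrt{\log p_n} \quad\text{and}\quad y_n' = \frac{1}{\sigma_n}\Big(y_n - \sqrt{\frac{n}{\log p_n}}\,\mu_n\Big). \tag{111}
$$

First, by (105),

$$
|y_n' - y_n| \le \frac{|\sigma_n - 1|}{\sigma_n}\,y_n + \frac{\sqrt{n}\,|\mu_n|}{\sigma_n\sqrt{\log p_n}} \le Ce^{-n^{\beta}/C} \tag{112}
$$

for all $n \ge 1$ since both $\sigma_n$ and $y_n$ have limits and $p_n \to \infty$. In particular, since $\log p_n = o(n^{\beta})$,

$$
x_n = o(n^{\beta/2}) \tag{113}
$$

as $n \to \infty$. Now, set

$$
\eta_i' = \frac{\eta_i - \mu_n}{\sigma_n}
$$

for $1 \le i \le n$. Easily

$$
P\Big(\frac{\sum_{i=1}^n \eta_i}{\sqrt{n\log p_n}} \ge y_n\Big) = P\Big(\frac{\sum_{i=1}^n \eta_i'}{\sqrt{n\log p_n}} \ge y_n'\Big) \tag{114}
$$

for all $n \ge 1$. Reviewing (104), for some constant $K > 0$, we have $|\eta_i'| \le Kn^{\gamma}$ for $1 \le i \le n$.
Take $c_n = Kn^{\gamma - 1/2}$. Recalling $x_n$ in (111), it is easy to check that

$$
s_n := \Big(\sum_{i=1}^n E\eta_i'^2\Big)^{1/2} = \sqrt{n},\quad \rho_n := \sum_{i=1}^n E|\eta_i'|^3 \sim nC,\quad |\eta_i'| \le c_ns_n \quad\text{and}\quad 0 < c_n \le 1
$$

as n is sufficiently large. Recalling $\gamma = (1 - \beta)/2$, it is easy to see from (113) that

$$
0 < x_n < \frac{1}{18c_n}
$$

for n large enough. Now, let $\gamma(x)$ be as in Lemma 6.3. Since $\beta \le 1/3$, by the lemma and (113),

$$
\Big|\gamma\Big(\frac{x_n}{s_n}\Big)\Big| \le \frac{2x_n^3\rho_n}{s_n^3} = o\big(n^{(3\beta-1)/2}\big) \to 0 \quad\text{and}\quad \frac{1 + x_n}{s_n} = o\big(n^{(\beta-1)/2}\big) \to 0
$$

as $n \to \infty$. By (111) and (112), $x_ns_n = y_n'\sqrt{n\log p_n}$ and $x_n \to \infty$ as $n \to \infty$. Use Lemma 6.3
and the fact $1 - \Phi(t) \sim \frac{1}{\sqrt{2\pi}\,t}e^{-t^2/2}$ as $t \to +\infty$ to obtain

$$
P\Big(\frac{\sum_{i=1}^n \eta_i'}{\sqrt{n\log p_n}} \ge y_n'\Big) = P\Big(\sum_{i=1}^n \eta_i' \ge x_ns_n\Big) \sim 1 - \Phi(x_n) \sim \frac{e^{-x_n^2/2}}{\sqrt{2\pi}\,x_n} \tag{115}
$$

as $n \to \infty$. This and (114) conclude (110).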
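Step 3. Now we show

$$
\frac{e^{-x_n^2/2}}{\sqrt{2\pi}\,x_n} \sim \frac{p_n^{-y_n^2/2}(\log p_n)^{-1/2}}{\sqrt{2\pi}\,y} := \omega_n \tag{116}
$$

as $n \to \infty$. Since $y_n \to y$ and $\sigma_n \to 1$, we know from (112) that

$$
\sqrt{2\pi}\,x_n = \sqrt{2\pi}\,y_n'(\log p_n)^{1/2} \sim \sqrt{2\pi}\,y\,(\log p_n)^{1/2} \tag{117}
$$

as $n \to \infty$. Further, by (111),

$$
\frac{e^{-x_n^2/2}}{p_n^{-y_n^2/2}} = \exp\Big\{-\frac{y_n'^2}{2}\log p_n + \frac{y_n^2}{2}\log p_n\Big\} = \exp\Big\{\frac{1}{2}\big(y_n^2 - y_n'^2\big)\log p_n\Big\}. \tag{118}
$$

Since $y_n \to y$, by (112), both $\{y_n\}$ and $\{y_n'\}$ are bounded. It follows from (112) again that
$|y_n^2 - y_n'^2| \le C|y_n - y_n'| = O(e^{-n^{\beta}/C})$ as $n \to \infty$. With assumption $\log p_n = o(n^{\beta})$ we get
$e^{-x_n^2/2} \sim p_n^{-y_n^2/2}$ as $n \to \infty$, which combining with (117) yields (116).

Finally, we compare the right hand sides of (107) and (116). Choose $C' > \max\{y_n^2;\ n \ge 1\}$. Since $\log p_n = o(n^{\beta})$, recalling $\omega_n$ in (116),

$$
\begin{aligned}
\frac{2e^{-n^{\beta}/C}}{\omega_n} &= 2\sqrt{2\pi}\,y\,(\log p_n)^{1/2}\,p_n^{y_n^2/2}\,e^{-n^{\beta}/C} \\
&= O\Big(n^{\beta/2}\cdot\exp\Big\{C'\log p_n - \frac{n^{\beta}}{C}\Big\}\Big) \\
&= O\Big(n^{\beta/2}\cdot\exp\Big\{-\frac{n^{\beta}}{2C}\Big\}\Big) \to 0
\end{aligned}
$$

as $n \to \infty$ for any constant $C > 0$. This fact joint with (107), (110) and (116) proves the
lemma. ■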
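The tail asymptotics $1 - \Phi(t) \sim e^{-t^2/2}/(\sqrt{2\pi}\,t)$ used in Step 2 can be illustrated numerically. The sketch below is an illustration only; the function names `gaussian_tail` and `mills_approx` are chosen here and do not come from the paper. It computes the exact standard normal tail through the complementary error function and compares it with the leading-order approximation.

```python
import math

def gaussian_tail(t):
    """Exact upper tail 1 - Phi(t) of the standard normal, via erfc."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def mills_approx(t):
    """Leading-order approximation exp(-t^2/2) / (sqrt(2*pi) * t) for t > 0."""
    return math.exp(-t * t / 2.0) / (math.sqrt(2.0 * math.pi) * t)

# The ratio exact/approximation increases towards 1 as t grows.
for t in (1.0, 2.0, 4.0, 8.0):
    print(t, gaussian_tail(t) / mills_approx(t))
```

The ratio is always below 1 and bounded below by $t^2/(1+t^2)$, so the approximation is already accurate to within a few percent at moderate t.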



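Proof of Lemma 6.10. For any Borel set $A \subset \mathbb{R}$, set $P^2(A) = P(A \mid u_{k1}, u_{k3}, 1 \le k \le n)$,
the conditional probability of A with respect to $u_{k1}, u_{k3}$, $1 \le k \le n$. Observe from the
expression of $\Sigma_4$ that three sets of random variables $\{u_{k1}, u_{k3};\ 1 \le k \le n\}$, $\{u_{k2};\ 1 \le k \le n\}$
and $\{u_{k4};\ 1 \le k \le n\}$ are independent. Then

$$
\begin{aligned}
&P\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big) \\
&\quad= E\Big\{P^2\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n\Big)\,P^2\Big(\Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big)\Big\} \\
&\quad\le \Big[E\,P^2\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n\Big)^2\Big]^{1/2}\cdot\Big[E\,P^2\Big(\Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big)^2\Big]^{1/2}
\end{aligned}
$$

by the Cauchy-Schwarz inequality. Use the same independence again to get

$$
P^2\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n\Big) = P\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n \,\Big|\, u_{k1}, 1 \le k \le n\Big); \tag{119}
$$

$$
P^2\Big(\Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big) = P\Big(\Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n \,\Big|\, u_{k3}, 1 \le k \le n\Big). \tag{120}
$$

These can be also seen from Proposition 27 in Fristedt and Gray (1997). It follows that

$$
\sup_{|r|\le 1} P\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big) \le E\Big\{P\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n \,\Big|\, u_{11}, \cdots, u_{n1}\Big)^2\Big\}.
$$

Since $\{u_{k1};\ 1 \le k \le n\}$ and $\{u_{k2};\ 1 \le k \le n\}$ are independent, and $t_n := a_n/\sqrt{n\log p} \to t = 2$, taking $\alpha = 2$ in Lemma 6.7 we obtain the desired conclusion from the lemma. ■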

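Proof of Lemma 6.11. Since $\Sigma_4$ is always non-negative definite, the determinant of the
first $3 \times 3$ minor of $\Sigma_4$ is non-negative: $1 - r_1^2 - r_2^2 \ge 0$. Let $r_3 = \sqrt{1 - r_1^2 - r_2^2}$ and $\{u_{k5};\ 1 \le k \le n\}$ be i.i.d. standard normals which are independent of $\{u_{ki};\ 1 \le i \le 4;\ 1 \le k \le n\}$.
Then,

$$
(u_{11}, u_{12}, u_{13}, u_{14}) \overset{d}{=} (u_{11},\ u_{12},\ r_1u_{11} + r_2u_{12} + r_3u_{15},\ u_{14}).
$$

Define $Z_{ij} = |\sum_{k=1}^n u_{ki}u_{kj}|$ for $1 \le i, j \le 5$ and $r_5 = r_3$. By the Cauchy-Schwarz inequality,

$$
\begin{aligned}
\Big|\sum_{k=1}^n (r_1u_{k1} + r_2u_{k2} + r_3u_{k5})u_{k4}\Big| &\le \sum_{i\in\{1,2,5\}} |r_i|\cdot\Big|\sum_{k=1}^n u_{ki}u_{k4}\Big| \\
&\le \big(r_1^2 + r_2^2 + r_3^2\big)^{1/2}\cdot\big(Z_{14}^2 + Z_{24}^2 + Z_{54}^2\big)^{1/2} \\
&\le \sqrt{3}\cdot\max\{Z_{14}, Z_{24}, Z_{54}\}.
\end{aligned}
$$

It follows from the above two facts that

$$
\begin{aligned}
P\Big(\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > a_n,\ \Big|\sum_{k=1}^n u_{k3}u_{k4}\Big| > a_n\Big)
&\le P\Big(Z_{12} > a_n,\ \max\{Z_{14}, Z_{24}, Z_{54}\} > \frac{a_n}{\sqrt{3}}\Big) \\
&\le \sum_{i\in\{1,2,5\}} P\Big(Z_{12} > a_n,\ Z_{i4} > \frac{a_n}{\sqrt{3}}\Big) \\
&= 2P\Big(Z_{12} > a_n,\ Z_{14} > \frac{a_n}{\sqrt{3}}\Big) + P(Z_{12} > a_n)\cdot P\Big(Z_{54} > \frac{a_n}{\sqrt{3}}\Big)
\end{aligned} \tag{121}
$$

by symmetry and independence. For any Borel set $A \subset \mathbb{R}$, set $P^1(A) = P(A \mid u_{k1}, 1 \le k \le n)$, the conditional probability of A with respect to $u_{k1}$, $1 \le k \le n$. For any $s > 0$, from
the fact that $\{u_{k1}\}$, $\{u_{k2}\}$ and $\{u_{k4}\}$ are independent, we see that

$$
\begin{aligned}
P(Z_{12} > a_n,\ Z_{14} > sa_n) &= E\big(P^1(Z_{12} > a_n)\cdot P^1(Z_{14} > sa_n)\big) \\
&\le \big[E\,P^1(Z_{12} > a_n)^2\big]^{1/2}\cdot\big[E\,P^1(Z_{14} > sa_n)^2\big]^{1/2}
\end{aligned}
$$

by the Cauchy-Schwarz inequality. Taking $t_n := a_n/\sqrt{n\log p} \to t = 2$ and $t_n := sa_n/\sqrt{n\log p} \to t = 2s$ in Lemma 6.7, respectively, we get

$$
E\,P^1(Z_{12} > a_n)^2 = O\big(p^{-4+\epsilon}\big) \quad\text{and}\quad E\,P^1(Z_{14} > sa_n)^2 = O\big(p^{-4s^2+\epsilon}\big)
$$

as $n \to \infty$ for any $\epsilon > 0$. This implies that, for any $s > 0$ and $\epsilon > 0$,

$$
P(Z_{12} > a_n,\ Z_{14} > sa_n) \le O\big(p^{-2-2s^2+\epsilon}\big) \tag{122}
$$

as $n \to \infty$. In particular,

$$
P\Big(Z_{12} > a_n,\ Z_{14} > \frac{a_n}{\sqrt{3}}\Big) \le O\big(p^{-8/3+\epsilon}\big) \tag{123}
$$

as $n \to \infty$ for any $\epsilon > 0$.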

Now we bound the last term in (121). Note that $|u_{11}u_{12}| \le (u_{11}^2 + u_{12}^2)/2$; it follows that
$Ee^{|u_{11}u_{12}|/2} < \infty$ by independence and $E\exp(N(0,1)^2/4) < \infty$. Since $\{u_{k1}u_{k2};\ 1 \le k \le n\}$
are i.i.d. with mean zero and variance one, and $y_n := a_n/\sqrt{n\log p} \to 2$ as $n \to \infty$, taking
$\alpha = 1$ in Lemma 6.9 we get

$$
P(Z_{12} > a_n) = P\Big(\frac{1}{\sqrt{n\log p}}\Big|\sum_{k=1}^n u_{k1}u_{k2}\Big| > \frac{a_n}{\sqrt{n\log p}}\Big) \sim 2\cdot\frac{p^{-y_n^2/2}(\log p)^{-1/2}}{2\sqrt{2\pi}} = O\Big(\frac{1}{p^{2-\epsilon}}\Big) \tag{124}
$$

as $n \to \infty$. Similarly, for any $t > 0$,

$$
P(Z_{12} > ta_n) = O\big(p^{-2t^2+\epsilon}\big) \tag{125}
$$

as $n \to \infty$ (this can also be derived from (i) of Lemma 6.4). In particular,

$$
P\Big(Z_{54} > \frac{a_n}{\sqrt{3}}\Big) = P\Big(Z_{12} > \frac{a_n}{\sqrt{3}}\Big) = O\big(p^{-2/3+\epsilon}\big) \tag{126}
$$

as $n \to \infty$ for any $\epsilon > 0$. Combining (124) and (126), we know that the last term in (121)
is bounded by $O(p^{-8/3+\epsilon})$ as $n \to \infty$ for any $\epsilon > 0$. This together with (121) and (123)
concludes the lemma. ■
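The exponent bookkeeping in this proof is easy to verify with exact rational arithmetic. The short Python sketch below is illustrative only; the helper names `joint_tail_exponent` and `marginal_tail_exponent` are introduced here for the purpose. It evaluates the exponent $2 + 2s^2$ from (122) and the exponent $2t^2$ from (125) at $s^2 = t^2 = 1/3$, which is the case $a_n/\sqrt{3}$ used in (121).

```python
from fractions import Fraction

def joint_tail_exponent(s_squared):
    """Exponent e in the bound O(p^{-e+eps}) of (122): e = 2 + 2 s^2."""
    return 2 + 2 * Fraction(s_squared)

def marginal_tail_exponent(t_squared):
    """Exponent e in the bound O(p^{-e+eps}) of (125): e = 2 t^2."""
    return 2 * Fraction(t_squared)

one_third = Fraction(1, 3)  # s^2 = t^2 = 1/3 corresponds to the threshold a_n / sqrt(3)

# First term of (121): joint tail with s = 1/sqrt(3).
print(joint_tail_exponent(one_third))          # 8/3
# Last term of (121): exponent 2 for P(Z_12 > a_n) plus 2/3 for P(Z_54 > a_n/sqrt(3)).
print(2 + marginal_tail_exponent(one_third))   # 8/3
```

Both terms on the right hand side of (121) therefore decay at the same polynomial rate, up to the arbitrarily small $\epsilon$.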

Proof of Lemma 6.12. Fix $\delta \in (0, 1)$. Take independent standard normals $\{u_{k5}, u_{k6};\ 1 \le k \le n\}$ that are also independent of $\{u_{ki};\ 1 \le i \le 4;\ 1 \le k \le n\}$. Then, since $\{u_{k1}, u_{k2}, u_{k5}, u_{k6};\ 1 \le k \le n\}$ are i.i.d. standard normals, by checking covariance matrix $\Sigma_4$, we know

$$
(u_{11}, u_{12}, u_{13}, u_{14}) \overset{d}{=} (u_{11},\ u_{12},\ r_1u_{11} + r_1'u_{15},\ r_2u_{12} + r_2'u_{16}) \tag{127}
$$

where $r_1' = \sqrt{1 - r_1^2}$ and $r_2' = \sqrt{1 - r_2^2}$. Define $Z_{ij} = |\sum_{k=1}^n u_{ki}u_{kj}|$ for $1 \le i, j \le 6$. Then

$$
\begin{aligned}
\Big|\sum_{k=1}^n (r_1u_{k1} + r_1'u_{k5})(r_2u_{k2} + r_2'u_{k6})\Big|
&\le |r_1r_2|Z_{12} + |r_1r_2'|Z_{16} + |r_1'r_2|Z_{25} + |r_1'r_2'|Z_{56} \\
&\le (1 - \delta)^2Z_{12} + 3\max\{Z_{16}, Z_{25}, Z_{56}\}
\end{aligned} \tag{128}
$$

for all $|r_1|, |r_2| \le 1 - \delta$. Let $\alpha = (1 + (1 - \delta)^2)/2$, $\beta = \alpha/(1 - \delta)^2$ and $\gamma = (1 - \alpha)/3$. Then

$$
\beta > 1 \quad\text{and}\quad \gamma > 0. \tag{129}
$$

Easily, if $Z_{12} \le \beta a_n$ and $\max\{Z_{16}, Z_{25}, Z_{56}\} \le \gamma a_n$, then from (128) we know that the left hand
side of (128) is controlled by $a_n$. Consequently, by (127) and the i.i.d. property,

$$
\begin{aligned}
P(Z_{12} > a_n,\ Z_{34} > a_n) &= P\Big(Z_{12} > a_n,\ \Big|\sum_{k=1}^n (r_1u_{k1} + r_1'u_{k5})(r_2u_{k2} + r_2'u_{k6})\Big| > a_n\Big) \\
&\le P(Z_{12} > a_n,\ Z_{12} > \beta a_n) + \sum_{(i,j)\in\{(1,6),(2,5),(5,6)\}} P(Z_{12} > a_n,\ Z_{ij} > \gamma a_n) \\
&= P(Z_{12} > \beta a_n) + 2P(Z_{12} > a_n,\ Z_{16} > \gamma a_n) + P(Z_{12} > a_n)\cdot P(Z_{56} > \gamma a_n)
\end{aligned} \tag{130}
$$

where "$2P(Z_{12} > a_n, Z_{16} > \gamma a_n)$" comes from the fact $(Z_{12}, Z_{16}) \overset{d}{=} (Z_{12}, Z_{25})$. Keep in
mind that $(Z_{12}, Z_{16}) \overset{d}{=} (Z_{12}, Z_{14})$ and $Z_{56} \overset{d}{=} Z_{12}$. Recalling (129), applying (122) and (125) to
the three terms in the sum on the right hand side of (130), we conclude (55). ■
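The choice of the constants $\alpha$, $\beta$ and $\gamma$ in this proof is made so that $(1-\delta)^2\beta + 3\gamma = 1$, which is what makes the left hand side of (128) at most $a_n$ on the event $\{Z_{12} \le \beta a_n,\ \max\{Z_{16}, Z_{25}, Z_{56}\} \le \gamma a_n\}$. A minimal Python sketch of this check follows; the function names `lemma_constants` and `combined_bound_coefficient` are hypothetical helpers introduced here for illustration.

```python
def lemma_constants(delta):
    """Return (alpha, beta, gamma) as defined before (129), for delta in (0, 1)."""
    alpha = (1.0 + (1.0 - delta) ** 2) / 2.0
    beta = alpha / (1.0 - delta) ** 2
    gamma = (1.0 - alpha) / 3.0
    return alpha, beta, gamma

def combined_bound_coefficient(delta):
    """Coefficient of a_n in (1-delta)^2 * (beta * a_n) + 3 * (gamma * a_n)."""
    alpha, beta, gamma = lemma_constants(delta)
    return (1.0 - delta) ** 2 * beta + 3.0 * gamma

for delta in (0.1, 0.5, 0.9):
    alpha, beta, gamma = lemma_constants(delta)
    # beta > 1 and gamma > 0 as claimed in (129); the coefficient equals 1 up to rounding.
    print(delta, beta > 1.0, gamma > 0.0, combined_bound_coefficient(delta))
```

Algebraically, $(1-\delta)^2\beta = \alpha$ and $3\gamma = 1 - \alpha$, so the coefficient is exactly 1 for every $\delta \in (0, 1)$.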

Proof of Lemma 6.13. Reviewing notation $\Omega_3$ defined below (68), the current case is
that $d_1 < d_2 \le d_4$ with $d = (d_1, d_2)$ and $d' = (d_3, d_4)$. Of course, by definition, $d_1 < d_2$

and $d_3 < d_4$. To save notation, define the "neighborhood" of $d_i$ as follows:

$$
N_i = \{d \in \{1, \cdots, p\};\ |d - d_i| < \tau\} \tag{131}
$$

for $i = 1, 2, 3, 4$.

Given $d_1 < d_2$, there are two possibilities for $d_4$: (a) $d_4 - d_2 > \tau$ and (b) $0 \le d_4 - d_2 \le \tau$.
There are four possibilities for $d_3$: (A) $d_3 \in N_2 \setminus N_1$; (B) $d_3 \in N_1 \setminus N_2$; (C) $d_3 \in N_1 \cap N_2$; (D)
$d_3 \notin N_1 \cup N_2$. There are eight combinations for the locations of $(d_3, d_4)$ in total. However,
by (67) the combination (a) & (D) is excluded. Our analysis next will exhaust all of the
seven possibilities.

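Case (a) & (A). Let $\Omega_{a,A}$ be the subset of $(d, d') \in \Omega_3$ satisfying restrictions (a) and (A),
and others such as $\Omega_{b,C}$ are similarly defined. Thus,

$$
\sum_{(d,d')\in\Omega_3} P(Z_d > a_n,\ Z_{d'} > a_n) \le \sum_{\theta,\Theta}\ \sum_{(d,d')\in\Omega_{\theta,\Theta}} P(Z_d > a_n,\ Z_{d'} > a_n) \tag{132}
$$

where $\theta$ runs over set $\{a, b\}$ and $\Theta$ runs over set $\{A, B, C, D\}$ but $(\theta, \Theta) \ne (a, D)$.
Easily, $|\Omega_{a,A}| \le \tau p^3$ and the covariance matrix of $(w_{d_2}, w_{d_1}, w_{d_3}, w_{d_4})$ (see (70)) is

$$
\begin{pmatrix}
1 & 0 & \gamma & 0 \\
0 & 1 & 0 & 0 \\
\gamma & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
\qquad |\gamma| \le 1.
$$

Take $\epsilon = 1/2$ in Lemma 6.10 to have $P(Z_d > a_n, Z_{d'} > a_n) \le \rho_n = o(p^{-7/2})$ for all
$(d, d') \in \Omega_{a,A}$. Thus

$$
\sum_{(d,d')\in R} P(Z_d > a_n,\ Z_{d'} > a_n) \le |R|\cdot\rho_n \to 0 \tag{133}
$$

as $n \to \infty$ for $R = \Omega_{a,A}$.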

Case (a) & (B). Notice $|\Omega_{a,B}| \le \tau p^3$ and the covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is
the same as that in Lemma 6.10. By the lemma we then have (133) for $R = \Omega_{a,B}$.

Case (a) & (C). Notice $|\Omega_{a,C}| \le \tau^2p^2$ and the covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$ is
the same as that in Lemma 6.11. By the lemma, we know (133) holds for $R = \Omega_{a,C}$.

Case (b) & (A). In this case, $|\Omega_{b,A}| \le \tau^2p^2$ and the covariance matrix of $(w_{d_3}, w_{d_4}, w_{d_2}, w_{d_1})$
is the same as that in Lemma 6.11. By the lemma and using the fact that

$$
P(Z_d > a_n,\ Z_{d'} > a_n) = P\big(Z_{(d_3,d_4)} > a_n,\ Z_{(d_2,d_1)} > a_n\big)
$$

we see (133) holds with $R = \Omega_{b,A}$.

Case (b) & (B). In this case, $|\Omega_{b,B}| \le \tau^2p^2$ and the covariance matrix of $(w_{d_1}, w_{d_2}, w_{d_3}, w_{d_4})$
is the same as that in Lemma 6.12. By the lemma, we know (133) holds for $R = \Omega_{b,B}$.
Case (b) & (C). We assign positions for $d_1, d_3, d_2, d_4$ step by step: there are at most p
positions for $d_1$ and at most $\tau$ positions for each of $d_3$, $d_2$ and $d_4$. Thus, $|\Omega_{b,C}| \le \tau^3p$. By
(124),

$$
P(Z_d > a_n,\ Z_{d'} > a_n) \le P(Z_d > a_n) = P\Big(\Big|\sum_{i=1}^n \xi_i\eta_i\Big| > a_n\Big) = O\Big(\frac{1}{p^{2-\epsilon}}\Big)
$$

as $n \to \infty$, where $\{\xi_i, \eta_i;\ i \ge 1\}$ are i.i.d. standard normals. Therefore, (133) holds with
$R = \Omega_{b,C}$.
Case (b) & (D). In this case, $|\Omega_{b,D}| \le \tau p^3$ and the covariance matrix of $(w_{d_4}, w_{d_3}, w_{d_2}, w_{d_1})$
is the same as that in Lemma 6.10. By the lemma and noting the fact that

$$
P(Z_d > a_n,\ Z_{d'} > a_n) = P\big(Z_{(d_4,d_3)} > a_n,\ Z_{(d_2,d_1)} > a_n\big)
$$

we see (133) holds with $R = \Omega_{b,D}$.

We obtain (75) by combining (133) for all the cases considered above with (132). ■